PREFACE


The authors' aim has been to present, between the covers of a single
book, those parts of mathematics which form the tools of the modern
worker in theoretical physics and chemistry. They have endeavored to
do this by steering a middle course between the mere recording of facts and 
formulas which is typical of handbook treatments, and the ponderous 
development which characterizes treatises in special fields. Therefore, 
as far as space permitted, all results have been embedded in the logical 
texture of proofs. Occasionally, when full demonstrations are lengthy or 
not particularly illuminating with respect to the subject at hand, they 
have been omitted in favor of references to the literature. Except for the 
first chapter, which is primarily a survey, proofs have always been given 
where omission would destroy the continuity of treatment. 

Arbitrary selection of topics has been necessary for lack of space. This 
was based partly on the authors' opinions as to the relevance of various
subjects, partly on the results of consultations with colleagues. The 
degree of difficulty of the treatment is such that a Senior majoring in physics 
or chemistry would be able to read most parts of the book with under- 
standing. 

While inclusion of large collections of routine problems did not seem
conformable to the purpose of the book, the authors have felt that its
usefulness might be augmented by two minor pedagogical devices: the
insertion here and there of fully worked examples illustrative of the theory
under discussion, and the dispersal, throughout the book, of special prob- 
lems confirming, and in some cases supplementing, the ideas of the text. 
Answers to the problems are usually given. 

The degree of rigor to which we have aspired is that customary in 
careful scientific demonstrations, not the lofty heights accessible to the
pure mathematician. For this we make no apology; if the history of the
exact sciences teaches anything it is that emphasis on extreme rigor often
engenders sterility, and that the successful pioneer depends more on
brilliant hunches than on the results of existence theorems. We trust, of
course, that our effort to avoid rigor mortis has not brought us
dangerously close to the opposite extreme of sloppy reasoning.

A careful attempt has been made to insure continuity of presentation
within each chapter, and as far as possible throughout the book. The
diversity of the subjects has made it necessary to refer occasionally to
chapters ahead. Whenever this occurs it is done reluctantly and in order
to avoid repetition. 

As to form, considerations of literacy have often been given secondary 
rank in favor of conciseness and brevity, and no great attempt has been
made to disguise individual authorship by artificially uniformising the 
style. 

The authors have used the material of several of the chapters in a num- 
ber of special courses and have found its collection into a single volume 
convenient. To venture a few specific suggestions, the book, if it were 
judged favorably by mathematicians, would serve as a foundation for 
courses in applied mathematics on the senior and first year graduate level. 
A thorough introductory course in quantum mechanics could be based on 
chapter 2, parts of 3, 8 and 10, and chapter 11. Chapters 1, 10 and parts 
of 11 may be used in a short course which reviews thermodynamics and 
then treats statistical mechanics. Reading of chapters 4, 9, and 15 would 
prepare for an understanding of special treatments dealing with polyatomic 
molecules, and the liquid and solid state. Since ability to handle numerical
computations is very important in all branches of physics and chemistry,
a chapter designed to familiarize the reader with all tools likely to be needed 
in such work has been included. 

The index has been made sufficiently complete so that the book can 
serve as a ready reference to definitions, theorems and proofs. Graduate 
students and scientists whose memory of specific mathematical details is 
dimmed may find it useful in review. Last, but not least, the authors 
have had in mind the adventurous student of physics and chemistry who 
wishes to improve his mathematical knowledge through self-study. 


New Haven, Conn. 
March, 1943


Henry Margenau 
George M. Murphy 
CHAPTER 1

THE MATHEMATICS OF THERMODYNAMICS 

Most of the chapters of this book endeavor to treat some single mathe- 
matical method in a systematic manner. The subject of thermodynamics, 
being highly empirical and synoptic in its contents, does not contain a very 
uniform method of analysis. Nevertheless, it involves mathematical 
elements of considerable interest, chiefly centered about partial differentia- 
tion. Rather than omit these entirely from consideration, it seemed well 
to devote the present chapter to them. Of necessity, the treatment is 
perhaps less systematic than elsewhere. It is placed at the beginning 
because most readers are likely to have some familiarity with the subject 
and because the mathematical methods are simple. (A reading of the first 
chapter is not essential for an understanding of the remainder of the book.) 

1.1. Introduction. — The science of thermodynamics is concerned with
the laws that govern the transformations of energy of one kind into another 
during physical or chemical changes. These changes are assumed to occur 
within a thermodynamic system which is completely isolated from its sur- 
roundings. Such a system is described by means of thermodynamic variables 
which are of two kinds. Extensive variables are proportional to the amount 
of matter which is being considered; typical examples are the volume or the 
total energy of the system. Variables which are independent of the amount 
of matter present, such as pressure or temperature, are called intensive 
variables. 

It is found experimentally that it is not possible to change all of these 
variables independently, for if certain ones of them are held constant, the 
remaining ones are automatically fixed in value. Mathematically, such a 
situation is treated by the method of partial differentiation. Furthermore, 
a certain type of differential, called the exact differential, and an integral,
known as the line integral, are of great importance in the study of thermo-
dynamics. We propose to describe these matters in a general way and to
apply them to a few specific problems. We assume that the reader is
familiar with the general ideas of thermodynamics and refer him to other
sources¹ for a more complete treatment of the physical details.

¹ J. Willard Gibbs, Transactions of the Conn. Acad. (1876-1878); "Scientific
Papers of Willard Gibbs," Vol. 1, Longmans and Co. Some recent texts are: Epstein,
"Textbook of Thermodynamics," John Wiley and Sons, New York, 1937; MacDougall,
"Thermodynamics and Chemistry," Third Edition, John Wiley and Sons, New York,
1939; Steiner, "Introduction to Chemical Thermodynamics," McGraw-Hill Book
Co., New York, 1941; Zemansky, "Heat and Thermodynamics," McGraw-Hill, N.Y.,
1937.
1.2. Differentiation of Functions of Several Independent Variables. — If
z is a single-valued function of two real, independent variables, x and y,

$$z = f(x,y)$$

z is said to be an explicit function of x and y. The relation between the
three variables may be represented by plotting x, y and z along the axes of a
Cartesian coordinate system, the result being a surface. If we wish to
study the motion of some point (x,y) over the surface, there are three
possible cases: (a) x varies and y remains constant; (b) y varies, x remain-
ing constant; (c) both x and y vary simultaneously.

In the first and second cases, the path of the point will be along the
curves produced when planes, parallel to the XZ- or YZ-coordinate planes,
intersect the original surface. If x is increased by the small quantity Δx
and y remains constant, z changes from f(x,y) to f(x + Δx, y), and the
partial derivative of z with respect to x at the point (x,y) is defined by

$$f_x(x,y) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x,y)}{\Delta x}$$

The following alternative notations are often used

$$f_x(x,y) = z_x(x,y) = \left(\frac{\partial f}{\partial x}\right)_y = \left(\frac{\partial z}{\partial x}\right)_y \tag{1-1}$$

where the constancy of y is indicated by the subscript. Since both x and y 
are completely independent, the partial derivative is evaluated by the 
usual method for the differentiation of a function of a single variable, y 
being treated as a constant. 
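The limit definition above can be checked numerically. The following is a minimal sketch, not part of the original text: it estimates the two partial derivatives by a symmetric (central) difference quotient, which converges to the same limit as the one-sided quotient in the definition. The function names and the sample surface z = x²y + y³ are assumptions chosen only for illustration.

```python
def partial_x(f, x, y, h=1e-6):
    """Central-difference estimate of (df/dx) with y held constant."""
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)


def partial_y(f, x, y, h=1e-6):
    """Central-difference estimate of (df/dy) with x held constant."""
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)


def surface(x, y):
    # Hypothetical example surface z = x^2 * y + y^3 (illustration only).
    return x * x * y + y ** 3


if __name__ == "__main__":
    # Analytic values at (2, 3): z_x = 2xy = 12 and z_y = x^2 + 3y^2 = 31.
    print(partial_x(surface, 2.0, 3.0))
    print(partial_y(surface, 2.0, 3.0))
```

In each estimate only one argument is displaced while the other is passed through unchanged, which is exactly the statement that the remaining variable is treated as a constant.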

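Defining the partial derivative of z with respect to y (x remaining con-
stant) in a similar way, we may write

$$f_y(x,y) = z_y(x,y) = \left(\frac{\partial f}{\partial y}\right)_x = \left(\frac{\partial z}{\partial y}\right)_x \tag{1-2}$$

If z is a function of more than two variables

$$z = f(x_1, x_2, \cdots, x_n)$$

the simple geometric interpretation is lacking, but such a symbol as

$$\left(\frac{\partial z}{\partial x_1}\right)_{x_2, x_3, \cdots, x_n}$$

still means that the function is to be differentiated with respect to $x_1$ by
the usual rules, all other variables being considered as constants.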

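Since the partial derivatives are themselves functions of the independent
variables, they may be differentiated again to give second and higher
derivatives

$$
\begin{aligned}
\frac{\partial}{\partial x}\left(\frac{\partial z}{\partial x}\right) &= \frac{\partial^2 z}{\partial x^2}; &
\frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right) &= \frac{\partial^2 z}{\partial x\,\partial y} \\
\frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right) &= \frac{\partial^2 z}{\partial y\,\partial x}; &
\frac{\partial}{\partial y}\left(\frac{\partial z}{\partial y}\right) &= \frac{\partial^2 z}{\partial y^2}
\end{aligned}
\tag{1-3}
$$

It is not always true that $f_{xy} = f_{yx}$, but the order of differentiation is
immaterial if the function and its derivatives are continuous. Since this is
usually the case in physical applications, quantities such as $f_{xy}$, $f_{yx}$ or
$f_{xxy}$, $f_{xyx}$, $f_{yxx}$ will be considered identical in the present treatment.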

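1.3. Total Differentials. — In the third case of sec. 1.2, both x and y vary
simultaneously or, in geometric language, the point moves along a curve
determined by the intersection with z = f(x,y) of a surface which is neither
parallel with the XZ- nor YZ-coordinate plane. Since x and y are inde-
pendent, both Δx and Δy approach zero as Δz approaches zero. In that
case the change in z caused by increments Δx and Δy, called the total
differential of z, is given by

$$dz = \left(\frac{\partial z}{\partial x}\right)_y dx + \left(\frac{\partial z}{\partial y}\right)_x dy \tag{1-4}$$

If it happens that x and y depend on a single independent variable u (it
might be the arc length of the curve along which the point moves, or the
time),

$$z = f(x,y); \quad x = F_1(u); \quad y = F_2(u)$$

then, from (4)

$$\frac{dz}{du} = \left(\frac{\partial z}{\partial x}\right)_y \frac{dx}{du} + \left(\frac{\partial z}{\partial y}\right)_x \frac{dy}{du} \tag{1-5}$$

For the special case,

$$z = f(x,y); \quad x = F(y); \quad y \text{ independent}$$

$$\frac{dz}{dy} = \left(\frac{\partial z}{\partial x}\right)_y \frac{dx}{dy} + \left(\frac{\partial z}{\partial y}\right)_x \tag{1-6}$$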


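An important generalization of these results arises when x, y, ··· are not
independent variables but are each functions of a finite number of independ-
ent variables, u, v, ···

$$
\begin{aligned}
f &= f(x, y, z, \cdots) \\
x &= F_1(u, v, w, \cdots) \\
y &= F_2(u, v, w, \cdots) \\
&\;\;\vdots
\end{aligned}
$$

Then, from (4)

$$df = \left(\frac{\partial f}{\partial u}\right)_{v,w,\cdots} du + \left(\frac{\partial f}{\partial v}\right)_{u,w,\cdots} dv + \cdots \tag{1-7}$$

and from (5)

$$\left(\frac{\partial f}{\partial u}\right)_{v,w,\cdots} = \left(\frac{\partial f}{\partial x}\right)_{y,z,\cdots}\left(\frac{\partial x}{\partial u}\right)_{v,w,\cdots} + \left(\frac{\partial f}{\partial y}\right)_{x,z,\cdots}\left(\frac{\partial y}{\partial u}\right)_{v,w,\cdots} + \cdots \tag{1-8}$$

with similar expressions for $(\partial f/\partial v)$, $(\partial f/\partial w)$, ···. When these are put into
(7) we obtain

$$
\begin{aligned}
df &= \left[\frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u} + \cdots\right] du + \left[\frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v} + \cdots\right] dv + \cdots \\
&= \frac{\partial f}{\partial x}\left[\frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv + \cdots\right] + \frac{\partial f}{\partial y}\left[\frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv + \cdots\right] + \cdots
\end{aligned}
\tag{1-9}
$$

Since u, v, ··· are independent variables, we may write

$$
\begin{aligned}
dx &= \frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv + \cdots \\
dy &= \frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv + \cdots
\end{aligned}
\tag{1-10}
$$

Comparing coefficients in (9) and (10), we finally obtain

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \cdots \tag{1-11}$$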

The difference between (7) and (11) should be noted: in the former equa-
tion the partial derivatives are taken with respect to the independent
variables, while in the latter, with respect to the dependent variables. The
important conclusion may thus be drawn that the total differential may be
written either in the form (7) or (11); that is, df may be composed addi-
tively of terms $(\partial f/\partial x)\,dx$, regardless of whether x is a dependent or an
independent variable.
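1.4. Higher Order Differentials. — Differentials of the second, third and
higher orders are defined by

$$d^2 f = d(df); \quad d^3 f = d(d^2 f); \quad \cdots; \quad d^n f = d(d^{n-1} f)$$

If there are two variables x and y, we obtain from (4)

$$d^2 f = d(df) = d\left(\frac{\partial f}{\partial x}\right) dx + \frac{\partial f}{\partial x}\,d^2 x + d\left(\frac{\partial f}{\partial y}\right) dy + \frac{\partial f}{\partial y}\,d^2 y$$

However,

$$d\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) dx + \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) dy = \frac{\partial^2 f}{\partial x^2}\,dx + \frac{\partial^2 f}{\partial x\,\partial y}\,dy$$

with a similar expression for $d(\partial f/\partial y)$; hence

$$d^2 f = \frac{\partial^2 f}{\partial x^2}(dx)^2 + 2\frac{\partial^2 f}{\partial x\,\partial y}\,dx\,dy + \frac{\partial^2 f}{\partial y^2}(dy)^2 + \frac{\partial f}{\partial x}\,d^2 x + \frac{\partial f}{\partial y}\,d^2 y$$

If x and y are independent variables, $d^2 x = d^2 y = \cdots = d^n x = d^n y = 0$,
and the n-th order differential becomes

$$d^n f = \frac{\partial^n f}{\partial x^n}\,dx^n + \binom{n}{1}\frac{\partial^n f}{\partial x^{n-1}\,\partial y}\,dx^{n-1}dy + \cdots + \binom{n}{k}\frac{\partial^n f}{\partial x^{n-k}\,\partial y^k}\,dx^{n-k}dy^k + \cdots + \frac{\partial^n f}{\partial y^n}\,dy^n \tag{1-12}$$

where the $\binom{n}{k}$ are the binomial coefficients, $\binom{n}{k} = \binom{n}{n-k} = n!/k!\,(n-k)!$
(Cf. sec. 12.2.)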

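Example. Calculate dp and d²p for a gas obeying van der Waals'
equation:

$$p = \frac{RT}{V - b} - \frac{a}{V^2}$$

$$\left(\frac{\partial p}{\partial T}\right)_V = \frac{R}{V - b}; \quad \left(\frac{\partial p}{\partial V}\right)_T = -\frac{RT}{(V - b)^2} + \frac{2a}{V^3}$$

$$\left(\frac{\partial^2 p}{\partial T^2}\right)_V = 0; \quad \left(\frac{\partial^2 p}{\partial V^2}\right)_T = \frac{2RT}{(V - b)^3} - \frac{6a}{V^4}$$

$$\frac{\partial}{\partial V}\left(\frac{\partial p}{\partial T}\right) = -\frac{R}{(V - b)^2} = \frac{\partial}{\partial T}\left(\frac{\partial p}{\partial V}\right)$$

$$dp = \frac{R}{V - b}\,dT + \left[\frac{2a}{V^3} - \frac{RT}{(V - b)^2}\right] dV$$

$$d^2 p = \left[\frac{2RT}{(V - b)^3} - \frac{6a}{V^4}\right](dV)^2 - \frac{2R}{(V - b)^2}\,dV\,dT$$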

1.5. Implicit Functions. — In the preceding discussion, the dependence
of one variable on another has been given in explicit form, as x = f(y).
Let us assume the relation between the variables to be given in implicit
form such as f(x,y) = 0. If it is now desired to compute dy/dx, one could
solve f(x,y) = 0 for y and then differentiate. This procedure, which is
often needlessly complicated, may however be avoided, for, according
to (4),

$$df = \left(\frac{\partial f}{\partial x}\right)_y dx + \left(\frac{\partial f}{\partial y}\right)_x dy = 0 \tag{1-13}$$

and

$$\frac{dy}{dx} = -\frac{(\partial f/\partial x)_y}{(\partial f/\partial y)_x}$$

If the equations for a circle, $x^2 + y^2 - r^2 = 0$, or an ellipse,
$x^2/a^2 + y^2/b^2 - 1 = 0$, are taken for f(x,y) = 0, the advantage of using
this method to obtain derivatives is at once evident.
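To make the advantage concrete, here is the ellipse worked both ways. This is an illustrative calculation added for clarity; it uses only the rule that, when f(x,y) = 0, dy/dx equals minus the ratio of the partial derivatives of f with respect to x and y.

```latex
% Implicit route: f(x,y) = x^2/a^2 + y^2/b^2 - 1 = 0
\frac{\partial f}{\partial x} = \frac{2x}{a^2}, \qquad
\frac{\partial f}{\partial y} = \frac{2y}{b^2}, \qquad
\frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y}
             = -\frac{b^2 x}{a^2 y}

% Explicit route, for comparison (upper branch):
y = b\sqrt{1 - x^2/a^2}, \qquad
\frac{dy}{dx} = -\frac{b\,x}{a^2\sqrt{1 - x^2/a^2}}
             = -\frac{b^2 x}{a^2 y}
```

The two results agree, but the implicit route needs no choice of branch and no differentiation of a square root. Setting a = b gives the corresponding result for the circle, dy/dx = -x/y.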

If an implicit relation is given between three variables, F(x,y,z) = 0,
any one may be considered to depend on the other two, for there are three
possible relations

$$x = f(y,z); \quad y = g(x,z); \quad z = h(x,y)$$

If x be taken as the dependent variable, then

$$dF = F_x\,dx + F_y\,dy + F_z\,dz = 0$$

At constant y, dy = 0, so that

$$\left(\frac{\partial x}{\partial z}\right)_y = -\frac{F_z}{F_x} \tag{1-14}$$

at constant z, dz = 0, hence

$$\left(\frac{\partial x}{\partial y}\right)_z = -\frac{F_y}{F_x} \tag{1-15}$$
A third possibility arises if two relations are given between three vari-
ables

$$f(x,y,z) = 0$$

$$g(x,y,z) = 0$$

Then

$$df = f_x\,dx + f_y\,dy + f_z\,dz = 0$$

$$dg = g_x\,dx + g_y\,dy + g_z\,dz = 0$$

Solving these two equations, we obtain (see sec. 10.9)

$$dx : dy : dz = \begin{vmatrix} f_y & f_z \\ g_y & g_z \end{vmatrix} : \begin{vmatrix} f_z & f_x \\ g_z & g_x \end{vmatrix} : \begin{vmatrix} f_x & f_y \\ g_x & g_y \end{vmatrix}$$

Further examples of the properties of implicit functions and their deriva-
tives will be found in the discussion of thermodynamic quantities.

1.6. Implicit Functions in Thermodynamics. — The simplest thermo- 
dynamic systems are homogeneous fluids or solids, subjected to no external 
stresses except a constant hydrostatic pressure. Investigation shows that 
for all such systems, there is an equation of state or characteristic equation of 
the form 

$$f(p,V,T) = 0 \tag{1-16}$$

where p is the pressure exerted by the system, V is its volume and T, its 
temperature on some suitable scale. From (16), an equation of the form of 
(13) may then be obtained. 

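$$df = \left(\frac{\partial f}{\partial p}\right)_{V,T} dp + \left(\frac{\partial f}{\partial V}\right)_{p,T} dV + \left(\frac{\partial f}{\partial T}\right)_{p,V} dT = 0$$

Setting dp, dV, dT equal to zero, successively, there results a set of equations
similar to (14) and (15)

$$
\begin{aligned}
\left(\frac{\partial V}{\partial T}\right)_p &= -\frac{(\partial f/\partial T)_{p,V}}{(\partial f/\partial V)_{p,T}} = \frac{1}{(\partial T/\partial V)_p} \\
\left(\frac{\partial T}{\partial p}\right)_V &= -\frac{(\partial f/\partial p)_{V,T}}{(\partial f/\partial T)_{p,V}} = \frac{1}{(\partial p/\partial T)_V} \\
\left(\frac{\partial p}{\partial V}\right)_T &= -\frac{(\partial f/\partial V)_{p,T}}{(\partial f/\partial p)_{T,V}} = \frac{1}{(\partial V/\partial p)_T}
\end{aligned}
\tag{1-17}
$$

Three possible products may be found by multiplying any pair of these
equations and removing the common terms. A typical one is

$$\left(\frac{\partial p}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_p = -\left(\frac{\partial p}{\partial T}\right)_V \tag{1-18}$$

The product of all three derivatives is

$$\left(\frac{\partial p}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_p \left(\frac{\partial T}{\partial p}\right)_V = -1 \tag{1-19}$$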


These results are of considerable importance since they are verified by 
experiment, the derivatives being proportional to such physical quantities 
as the coefficients of compressibility, thermal expansion and temperature 
increase with pressure. 
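The statement that the product of the three derivatives equals -1 can be tested numerically for a simple equation of state. The sketch below is an illustration under stated assumptions: it uses one mole of an ideal gas, pV = RT, solved explicitly for each variable in turn, and estimates each derivative by a central difference. The helper names and the state point are hypothetical.

```python
R = 8.314  # gas constant, J/(mol K)


def p_of(V, T):
    return R * T / V


def V_of(T, p):
    return R * T / p


def T_of(p, V):
    return p * V / R


def derivative(g, x, h):
    """Central-difference estimate of dg/dx."""
    return (g(x + h) - g(x - h)) / (2.0 * h)


def cyclic_product(p, V, T):
    """(dp/dV)_T * (dV/dT)_p * (dT/dp)_V at a state satisfying pV = RT."""
    dp_dV_T = derivative(lambda v: p_of(v, T), V, V * 1e-6)
    dV_dT_p = derivative(lambda t: V_of(t, p), T, T * 1e-6)
    dT_dp_V = derivative(lambda q: T_of(q, V), p, p * 1e-6)
    return dp_dV_T * dV_dT_p * dT_dp_V


if __name__ == "__main__":
    T, V = 300.0, 0.0248          # arbitrary state point
    print(cyclic_product(p_of(V, T), V, T))
```

Each factor is taken with a different variable held constant, which is why the product is -1 rather than the +1 that a naive cancellation of differentials would suggest.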

1.7. Exact Differentials and Line Integrals. — It is often required, in
thermodynamic problems, to find values of a function u(x,y) at two points
$(x_1,y_1)$ and $(x_2,y_2)$ by integration of an equation

$$du(x,y) = M(x,y)\,dx + N(x,y)\,dy \tag{1-20}$$

between the limits $u_1$ and $u_2$.

The attempted integration results in such a symbol as $\int M(x,y)\,dx$,
which is meaningless unless y can be eliminated by a relation, y = f(x).
This is equivalent to specifying the path in the XY-plane along which the
integration is performed, hence integrals of (20) are known as line integrals.
There are many of these paths, the value of the definite integral differing,
in general, for each. The situation is particularly simple when du is a total
differential, or, as it is often called, a complete or exact differential. Com-
parison of (4) with (20) shows that in this case

$$M(x,y) = \partial u/\partial x; \quad N(x,y) = \partial u/\partial y \tag{1-21}$$

Moreover, since the order of differentiation is of no importance, it follows
that

$$\partial M/\partial y = \partial^2 u/\partial x\,\partial y = \partial N/\partial x \tag{1-22}$$

Inspection of (21) shows that u may be found by integration even when a
functional relation between x and y is unknown. In other words, the line
integral is independent of the path; it depends only on the values of x and
y at the upper and lower limits. The function u is then said to be a point
function.

In thermodynamics, it frequently happens that the upper and lower
limits are the same, that is, the integration is performed around a complete
cycle. If the differential du is exact, then the value of the line integral is
zero; if du is inexact, integration around a closed cycle gives a result not
equal to zero.
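The behavior around a closed cycle can be illustrated with two simple differentials. In the sketch below, which is an added illustration and not taken from the text, y dx + x dy is exact (it equals d(xy), and ∂M/∂y = ∂N/∂x = 1), while y dx - x dy fails the exactness test (∂M/∂y = 1, ∂N/∂x = -1). Both are integrated once around the unit circle by the midpoint rule; the choice of path, step count, and function names are assumptions.

```python
import math


def closed_line_integral(M, N, steps=20000):
    """Integrate M dx + N dy once around the unit circle (midpoint rule)."""
    total = 0.0
    dt = 2.0 * math.pi / steps
    for k in range(steps):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t) * dt, math.cos(t) * dt
        total += M(x, y) * dx + N(x, y) * dy
    return total


def exact_cycle():
    # du = y dx + x dy = d(xy): satisfies dM/dy = dN/dx.
    return closed_line_integral(lambda x, y: y, lambda x, y: x)


def inexact_cycle():
    # y dx - x dy: dM/dy = 1 but dN/dx = -1, so the differential is inexact.
    return closed_line_integral(lambda x, y: y, lambda x, y: -x)


if __name__ == "__main__":
    print(exact_cycle())    # vanishes up to rounding
    print(inexact_cycle())  # equals -2*pi, minus twice the enclosed area
```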

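1.8. Exact and Inexact Differentials in Thermodynamics. — Examples
of exact and inexact differentials are readily found in thermodynamics.
Consider a mole of an ideal gas, whose equation of state is pV = RT. Let
the initial conditions be $V_1$, $p_1$ and $T_1$ and the final conditions be $V_2$, $p_2$
and $T_2$. Calculate the change in volume and the work done in going
from the initial to the final state, the integration being along two different
paths in each case.

[Fig. 1-1: the direct path AC and the two-part path AB, BC between the initial state A and the final state C.]

Since V = f(p,T),

$$dV = \left(\frac{\partial V}{\partial T}\right)_p dT + \left(\frac{\partial V}{\partial p}\right)_T dp = \frac{R}{p}\,dT - \frac{RT}{p^2}\,dp \tag{1-23}$$

Let the first equation of path (AC in Fig. 1) be

$$T - T_1 = \frac{T_2 - T_1}{p_2 - p_1}(p - p_1) = \frac{\Delta T}{\Delta p}(p - p_1)$$

Then $dT = (\Delta T/\Delta p)\,dp$ and (23) becomes

$$dV = R\left[\frac{\Delta T}{\Delta p}\frac{dp}{p} - \left(T_1 + \frac{\Delta T}{\Delta p}(p - p_1)\right)\frac{dp}{p^2}\right] = R\left[\frac{\Delta T}{\Delta p}\,p_1 - T_1\right]\frac{dp}{p^2}$$

or, on integration,

$$V_2 - V_1 = \Delta V = \frac{R(T_2 p_1 - p_2 T_1)}{p_1 p_2}$$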
P 1 P 2 
The second path will be considered as consisting of two parts: AB and BC
(cf. Fig. 1).

Along path AB, $T = T_1$, dT = 0 and along BC, $p = p_2$, dp = 0, hence

$$dV = -RT_1\frac{dp}{p^2} + \frac{R}{p_2}\,dT,$$

or

$$\Delta V = \frac{R(T_2 p_1 - p_2 T_1)}{p_1 p_2}$$

The change in volume is thus the same for these alternative paths.

A similar conclusion might have been drawn from the test for exactness:

$$M = R/p; \quad N = -RT/p^2$$

$$\frac{\partial M}{\partial p} = -\frac{R}{p^2} = \frac{\partial N}{\partial T}$$

which shows that (23) is exact.

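The mechanical work done by an expanding gas is

$$dW = p\,dV \tag{1-24}$$

regardless of the shape of the container and provided that the expansion is
performed reversibly² in the thermodynamic sense. Combining (24) with
(23) we obtain

$$dW = R\,dT - \frac{RT}{p}\,dp \tag{1-25}$$

It is clear that dW is inexact since

$$M = R; \quad N = -\frac{RT}{p}; \quad \frac{\partial M}{\partial p} = 0 \neq \frac{\partial N}{\partial T} = -\frac{R}{p}$$

By path AC,

$$dW = R\left(\frac{\Delta T}{\Delta p}\,p_1 - T_1\right)\frac{dp}{p}$$

and, on integration,

$$W_2 - W_1 = \Delta W_1 = R\left(\frac{\Delta T}{\Delta p}\,p_1 - T_1\right)\ln\frac{p_2}{p_1}$$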

² Here and elsewhere in this chapter, we assume that all processes are performed
reversibly when such requirement is needed for the argument. For discussions of
reversibility, texts on thermodynamics should be consulted.
Along paths AB and BC,

$$dW = -RT_1\frac{dp}{p} + R\,dT$$

or

$$\Delta W_2 = R\left[T_1 \ln\frac{p_1}{p_2} + \Delta T\right]$$

Comparison of $\Delta W_1$ and $\Delta W_2$ shows that the work is different along the
two paths.
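The two-path comparison can be reproduced numerically. The following sketch integrates the exact differential dV = (R/p) dT - (RT/p²) dp and the inexact differential dW = R dT - (RT/p) dp for one mole of ideal gas along the straight path AC and along the broken path AB, BC in the (T, p) plane. The end states, the step count, and all function names are assumptions chosen for illustration.

```python
import math

R = 8.314                      # gas constant, J/(mol K)
T1, P1 = 300.0, 1.0e5          # hypothetical initial state A
T2, P2 = 400.0, 2.0e5          # hypothetical final state C

PATH_AC = [(T1, P1), (T2, P2)]                # straight line from A to C
PATH_ABC = [(T1, P1), (T1, P2), (T2, P2)]     # AB at constant T, BC at constant p


def integrate_path(M, N, path, steps=20000):
    """Midpoint-rule integral of M dT + N dp along straight segments."""
    total = 0.0
    for (Ta, pa), (Tb, pb) in zip(path, path[1:]):
        dT = (Tb - Ta) / steps
        dp = (pb - pa) / steps
        for k in range(steps):
            s = (k + 0.5) / steps
            T = Ta + s * (Tb - Ta)
            p = pa + s * (pb - pa)
            total += M(T, p) * dT + N(T, p) * dp
    return total


def delta_V(path):
    # dV = (R/p) dT - (RT/p^2) dp, an exact differential
    return integrate_path(lambda T, p: R / p, lambda T, p: -R * T / p ** 2, path)


def delta_W(path):
    # dW = R dT - (RT/p) dp, an inexact differential
    return integrate_path(lambda T, p: R, lambda T, p: -R * T / p, path)


if __name__ == "__main__":
    print(delta_V(PATH_AC), delta_V(PATH_ABC))   # equal within numerical error
    print(delta_W(PATH_AC), delta_W(PATH_ABC))   # clearly different
```

The volume change agrees along both paths with R(T₂/p₂ - T₁/p₁), while the two work integrals reproduce the separate closed-form results for the direct and the broken path.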

Heat absorbed or evolved in a process, dQ, also depends on the path.
The expression for the inexact differential with p and T as independent
variables is

$$dQ = C_p\,dT + \Lambda_p\,dp \tag{1-26}$$

where $C_p$ and $\Lambda_p$ are the continuous functions of T and p, known as the heat
capacity at constant pressure and the latent heat of change of pressure,
respectively.


Problem. Connect the points $p_1 V_1$ and $p_2 V_2$ of Fig. 1 with a circular arc. Inte-
grate (23) along this path.


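1.9. The Laws of Thermodynamics. — There are obvious advantages in
expressing the laws of thermodynamics in terms of quantities which are
independent of the path.³ As we have seen, both dQ and dW are inexact,
but the difference between them, a function known as the internal energy

$$dU = dQ - dW \tag{1-27}$$

is an exact differential. This equation⁴ often serves as a statement of the
first law of thermodynamics. By combining (25) and (26) we may also
write

$$dU = \left[C_p - p\left(\frac{\partial V}{\partial T}\right)_p\right] dT + \left[\Lambda_p - p\left(\frac{\partial V}{\partial p}\right)_T\right] dp \tag{1-28}$$

with the additional requirement of exactness from (22)

$$\frac{\partial}{\partial p}\left[C_p - p\left(\frac{\partial V}{\partial T}\right)_p\right]_T = \frac{\partial}{\partial T}\left[\Lambda_p - p\left(\frac{\partial V}{\partial p}\right)_T\right]_p \tag{1-29}$$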


³ This fact was recognized by Clausius, "The Mechanical Theory of Heat," trans-
lated by W. R. Browne, Macmillan & Co., London, 1879, who discusses the laws of
thermodynamics from this standpoint.

⁴ Note that +dQ means heat absorbed and +dW work done by the system. Minus
signs indicate heat evolved or work done on the system.
These two equations are a more satisfactory definition of the first law than
(27) since they show the essential fact that the internal energy, dU, is an
exact differential. The inexactness of dQ and dW is sometimes indicated⁵
by stating the first law in the form of (27) with symbols such as đQ, DQ,
or δQ on the right.

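The second law of thermodynamics is based upon an attempt to find a
function of dQ which is an exact differential. From (27) and (24),

$$dQ = dU + dW = dU + p\,dV \tag{1-27a}$$

but U = f(V,T), hence

$$dU = \left(\frac{\partial U}{\partial T}\right)_V dT + \left(\frac{\partial U}{\partial V}\right)_T dV$$

and

$$dQ = \left(\frac{\partial U}{\partial T}\right)_V dT + \left[\left(\frac{\partial U}{\partial V}\right)_T + p\right] dV \tag{1-30}$$

In passing from an initial state, $V_1$, $T_1$, to a final state, $V_2$, $T_2$, the integral
on the right of (30) cannot be evaluated without further information,
since the second term contains both p and V. In the special case of an
ideal gas where pV = RT and $(\partial U/\partial V)_T = 0$, (30) becomes

$$dQ = \left(\frac{\partial U}{\partial T}\right)_V dT + \frac{RT}{V}\,dV \tag{1-31}$$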


The first term on the right of this expression is the heat capacity at constant 
volume and depends on the temperature alone. If therefore we make the 
further restriction of constant temperature, that is, assume the process to 
be isothermal, the integral may be obtained. The form of (31) suggests that 
if we divide by T, the resulting equation 


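$$\frac{dQ}{T} = \frac{1}{T}\left(\frac{\partial U}{\partial T}\right)_V dT + \frac{R}{V}\,dV$$

may also be integrated when T changes. The more general inexact differ-
ential (26) when divided by T is also exact, the quantity S so defined being
the entropy

$$dS = \frac{dQ}{T} = \frac{C_p}{T}\,dT + \frac{\Lambda_p}{T}\,dp \tag{1-32}$$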

The condition for exactness

$$\frac{\partial}{\partial p}\left(\frac{C_p}{T}\right)_T = \frac{\partial}{\partial T}\left(\frac{\Lambda_p}{T}\right)_p \tag{1-33}$$

together with (32) serve as basis for a statement of the second law. Our
arguments concerning the first and second laws are intended only to show
their property of exactness. The most satisfactory formulation of these
laws is probably that of Carathéodory. We consider this subject in
sec. 1.15.

⁵ The question of a suitable notation for use in thermodynamics has been discussed
by Tunell, G., J. Phys. Chem. 36, 1744 (1932); J. Chem. Phys. 9, 191 (1941).

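The functions dU and dS may be combined by using (24), (27) and
(32), to give

$$dU = T\,dS - p\,dV \tag{1-34}$$

Since

$$U = f(S,V) \tag{1-35}$$

and dU is exact, we may also write

$$dU = \left(\frac{\partial U}{\partial S}\right)_V dS + \left(\frac{\partial U}{\partial V}\right)_S dV \tag{1-36}$$

Comparison of (34) with (36) shows that

$$\left(\frac{\partial U}{\partial S}\right)_V = T; \quad \left(\frac{\partial U}{\partial V}\right)_S = -p$$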

The importance of (35) arises from the fact that if U is known as a function
of two independent variables, S and V, it is possible to calculate numerical
values of p, T and U for any thermodynamic state when S and V are given.
A quantity like U thus furnishes more information than the equation of
state, for the latter will only give p, V and T; in order to obtain U and S,
the heat capacity as a function of temperature must also be given. It is not
necessary to choose S and V as the independent variables in (35) or (36), in
fact any pair of the set: p, V, T, S (or of the functions to be defined immedi-
ately) may be taken, but the resulting exact differential is simpler when S
and V are selected.⁶

When the conditions of a specific problem suggest another pair of inde-
pendent variables, it is more convenient to define additional thermodynamic
functions. These are given in the following relations, where the symbol as
used by Gibbs precedes the one now customary.

The heat content or enthalpy, $\chi = H = U + pV$

$$dH = dU + p\,dV + V\,dp = T\,dS + V\,dp \tag{1-37}$$

The work content or Helmholtz free energy, $\psi = A = U - TS$

$$dA = dU - T\,dS - S\,dT = -S\,dT - p\,dV \tag{1-38}$$

⁶ Gibbs preferred S and V as independent variables for reasons given in loc. cit.,
footnote on page 34.
The free energy or Gibbs thermodynamic potential, $\zeta = F = U - TS + pV$

$$dF = dU - T\,dS - S\,dT + p\,dV + V\,dp = -S\,dT + V\,dp \tag{1-39}$$


As in the case of dU, any pair of the set: p, V, T, S, U, H, A, F may
be chosen as independent variables, but the exact differential is simpler when
expressed in terms of the functions shown in the last equation of (37),
(38) or (39). Since most experimental work is done at constant pressure
rather than at constant volume, it is obvious that H and F (where the
pressure is one of the independent variables) are more generally useful
than U and A. The whole of the thermodynamics of systems of constant
composition may be developed, however, using any one of the following sets
of variables: (1) U, S, V; (2) H, S, p; (3) A, T, V; (4) F, T, p.

It is frequently necessary to have some means of predicting the direction
in which a system spontaneously approaches a state of thermodynamic
equilibrium. Let us consider two bodies, one at a temperature $T_1$ and the
other at a lower temperature $T_2$. Then if the whole system is surrounded
by adiabatic walls so that no heat enters it, we may write

$$dS_1 = -\frac{dQ}{T_1}; \quad dS_2 = \frac{dQ}{T_2}$$

where dQ is the heat absorbed by the colder body. The total entropy of the
system thus increases, for

$$dS = dS_1 + dS_2 = dQ\,\frac{(T_1 - T_2)}{T_1 T_2} > 0$$

Clearly dS = 0 when thermal equilibrium is reached. From (39), we also
see that at constant temperature and pressure, dF = 0 when equilibrium
is established. Since the entropy reaches a maximum, the free energy
simultaneously reaches a minimum. In Table 1, we collect the criteria
for spontaneous approach to equilibrium when various pairs of the inde-
pendent variables are held constant.

TABLE 1

Independent Variables Fixed     Dependent Variable

DEPENDENT VARIABLE BECOMES A MINIMUM

T, p                            F
T, V                            A
S, p                            H
S, V                            U
U, S                            V
A, T                            V
A, V or F, p                    T

DEPENDENT VARIABLE BECOMES A MAXIMUM

U, V or H, p                    S
F, T or H, S                    p


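Problem a. Find expressions for H, A, U in terms of set (4).

Ans. S = −∂F/∂T; H = F − T ∂F/∂T;
V = ∂F/∂p; A = F − p ∂F/∂p;
U = F − T ∂F/∂T − p ∂F/∂p.

Problem b. Verify the following relations:

$$\left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial p}{\partial S}\right)_V; \quad \left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V$$

$$\left(\frac{\partial T}{\partial p}\right)_S = \left(\frac{\partial V}{\partial S}\right)_p; \quad \left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p$$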


1.10. Systematic Derivation of Partial Thermodynamic Derivatives. —
With the addition of Q and W, we have ten important thermodynamic
quantities. The heat capacities are not included in the list, since by their
definitions: $C_p = (\partial Q/\partial T)_p$, $C_V = (\partial Q/\partial T)_V$, they may be readily deter-
mined from the set of ten just mentioned. We now wish to describe meth-
ods of obtaining all first order partial derivatives of the form $(\partial x/\partial y)_z$ where
x, y and z are any members of the set. It is immediately apparent that
there are a large number of them for there are ten ways of choosing x,
leaving nine and eight ways, respectively, of choosing y and z, a total of
720 first derivatives. When all possible relations between the first deriva-
tives are included, the total number of equations is increased enormously
for, in general, a selected derivative may be written in terms of three other
derivatives which are independent of each other as the following considera-
tions show. Suppose x = f(y,w); then

$$dx = \left(\frac{\partial x}{\partial y}\right)_w dy + \left(\frac{\partial x}{\partial w}\right)_y dw$$

and

$$\left(\frac{\partial x}{\partial y}\right)_z = \left(\frac{\partial x}{\partial y}\right)_w + \left(\frac{\partial x}{\partial w}\right)_y \left(\frac{\partial w}{\partial y}\right)_z$$

There are, of course, many cases where there are relations between fewer
than four derivatives but neglecting these, the total number of equations
obtainable is the number of combinations of 720 derivatives taken four at
a time, 720!/(4! 716!), or approximately $10^{10}$. Although many of the rela-
tions are of little use, it is convenient to devise a systematic method for
obtaining any of them.
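The two counts quoted above are easy to confirm. The short sketch below uses the standard-library functions `math.perm` and `math.comb` (available from Python 3.8); the wrapper names are illustrative only.

```python
import math


def count_first_derivatives(n_quantities=10):
    """Ordered choices of distinct x, y, z for (dx/dy)_z: n(n-1)(n-2)."""
    return math.perm(n_quantities, 3)


def count_four_derivative_relations(n_derivatives):
    """Unordered selections of four derivatives out of n_derivatives."""
    return math.comb(n_derivatives, 4)


if __name__ == "__main__":
    n = count_first_derivatives()
    print(n)                                   # 10 * 9 * 8
    print(count_four_derivative_relations(n))  # about 1.1e10
```

The second count is of order 10¹⁰, which is why tabulating every relation directly is impractical and a systematic method is wanted.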

The best known of these methods is that of Bridgman,⁷ which is simple
and often used. It will be described only briefly since it is a special case
of a more general procedure which we give in sec. 1.13. It is unnecessary
to compute the $10^{10}$ relations because any one of them could be obtained if
the 720 first derivatives were tabulated in terms of the same set of three
independent derivatives. The particular choice of the three is arbitrary,
Bridgman having taken

$$\left(\frac{\partial V}{\partial T}\right)_p; \quad \left(\frac{\partial V}{\partial p}\right)_T; \quad C_p = \left(\frac{\partial Q}{\partial T}\right)_p$$

because these are directly obtainable by experiment. One could then pick
any four derivatives, write them in terms of the chosen three and eliminate
the three derivatives from the four equations. The result would be a single
equation containing the four derivatives.

⁷ Bridgman, P. W., "Condensed Collection of Thermodynamic Formulas,"
Harvard University Press, Cambridge, Mass., 1926.

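The 720 derivatives could then be classified into ten groups by holding
one quantity constant and varying the other nine. Within the group
containing derivatives at constant z,

$$\left(\frac{\partial x}{\partial y}\right)_z = \frac{(\partial x/\partial w)_z}{(\partial y/\partial w)_z}$$

which follows by writing according to (11)

$$
\begin{aligned}
dx &= \left(\frac{\partial x}{\partial w}\right)_z dw + \left(\frac{\partial x}{\partial z}\right)_w dz \\
dy &= \left(\frac{\partial y}{\partial w}\right)_z dw + \left(\frac{\partial y}{\partial z}\right)_w dz
\end{aligned}
\tag{1-41}
$$

setting dz = 0 and dividing one equation by the other. It should be
remembered that even if x and y are not functions of w and z it is still
possible to have inexact differentials of the form of (41), hence the present
arguments apply to dQ and dW as well as to the remaining eight thermo-
dynamic functions. Upon adopting the abbreviations

$$(\partial x)_z = \left(\frac{\partial x}{\partial w}\right)_z; \quad (\partial y)_z = \left(\frac{\partial y}{\partial w}\right)_z; \quad \cdots$$

any derivative at constant z may be written in purely formal fashion by
taking the ratio of the proper pair, or

$$\left(\frac{\partial x}{\partial y}\right)_z = \frac{(\partial x)_z}{(\partial y)_z}$$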

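The task of computing the 72 derivatives in this group is thus reduced to
calculation of the nine quantities $(\partial x)_z$, $(\partial y)_z$, ···. The latter are easily
found when several of the derivatives $(\partial x/\partial y)_z$ are known in terms of the
fundamental three for it proves possible to split the former into numerator
and denominator by inspection.

If each of the remaining groups were treated in a similar way, 90 expres-
sions of the form $(\partial x)_z$, $(\partial y)_z$, $(\partial x)_y$, ··· would be obtained but in every
case $(\partial x)_y = -(\partial y)_x$ so that the final list need contain only 45 relations;
they are given by Bridgman (loc. cit.) in convenient tables.⁸ The follow-
ing examples show their use. Let it be required to calculate $(\partial T/\partial p)_H$.
From the tables, $(\partial T)_H = V - T(\partial V/\partial T)_p$, $(\partial p)_H = -C_p$, thus

$$\left(\frac{\partial T}{\partial p}\right)_H = \frac{T(\partial V/\partial T)_p - V}{C_p}$$

Many alternative forms are easily found, for example,

$$\left(\frac{\partial T}{\partial S}\right)_p = \frac{T}{C_p}; \quad \left(\frac{\partial T}{\partial p}\right)_S = \frac{T}{C_p}\left(\frac{\partial V}{\partial T}\right)_p; \quad \left(\frac{\partial S}{\partial p}\right)_H = -\frac{V}{T}$$

hence,

$$\left(\frac{\partial T}{\partial p}\right)_H = \left(\frac{\partial T}{\partial p}\right)_S + \left(\frac{\partial T}{\partial S}\right)_p \left(\frac{\partial S}{\partial p}\right)_H$$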

Additional examples, tables for a few of the second derivatives, and exten- 
sion of the method to include mechanical variables other than pressure 
have also been given by Bridgman. 
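As a brief illustration of how such a tabulated result is used, consider the derivative $(\partial T/\partial p)_H$ obtained from the ratio of the tabulated $(\partial T)_H$ and $(\partial p)_H$. The evaluation below for one mole of ideal gas is an added example, not part of Bridgman's tables.

```latex
\left(\frac{\partial T}{\partial p}\right)_H
  = \frac{(\partial T)_H}{(\partial p)_H}
  = \frac{T\,(\partial V/\partial T)_p - V}{C_p}

% Ideal gas, V = RT/p:
\left(\frac{\partial V}{\partial T}\right)_p = \frac{R}{p} = \frac{V}{T}
\quad\Longrightarrow\quad
T\left(\frac{\partial V}{\partial T}\right)_p - V = 0
\quad\Longrightarrow\quad
\left(\frac{\partial T}{\partial p}\right)_H = 0
```

An ideal gas therefore shows no temperature change on expansion at constant enthalpy; any nonzero value of this derivative measures departure from ideal behavior.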

A further amplification of the method has been presented by Goranson,⁹
whose tables include the following cases: (1) one-component unit mass
systems (constant total mass); (2) one-component variable mass systems
or two-component unit mass systems; (3) two-component variable mass
systems or three-component unit mass systems; (4) three-component
variable mass systems or four-component unit mass systems. Lerman¹⁰
has shown how the construction of such tables may be simplified.

1.11. Thermodynamic Derivatives by Method of Jacobians. — A more 
general method which is based on the properties of functional determinants 

® Abbreviated tables may be found in several places, for example, Slater, Introduc- 
tion to Chemical Physics,” McGraw-Hill Book Co., New York, 1939. 

® Goranson, Roy W., “ Thermodynamic Relations in Multi-component Systems,” 
Carnegie Institution of Washington, Washington, D. C., 1930. 

Lerman, /. Che?n. Phys. 6, 792 (1937). 
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or Jacobians has been described by Shaw.^^ The mathematical basis on 
which it is founded will be discussed in detail in order to explain the con- 
struction of the required table and its application to specific examples. 

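1.12. Properties of the Jacobian. — The Jacobian¹² of x and y with
respect to two independent variables, u and v, is defined by

\[ J(x,y/u,v) = \frac{\partial(x,y)}{\partial(u,v)} =
\begin{vmatrix}
\left(\dfrac{\partial x}{\partial u}\right)_v & \left(\dfrac{\partial x}{\partial v}\right)_u \\[2ex]
\left(\dfrac{\partial y}{\partial u}\right)_v & \left(\dfrac{\partial y}{\partial v}\right)_u
\end{vmatrix}
= \left(\frac{\partial x}{\partial u}\right)_v \left(\frac{\partial y}{\partial v}\right)_u
- \left(\frac{\partial x}{\partial v}\right)_u \left(\frac{\partial y}{\partial u}\right)_v \tag{1-42} \]

When the independent variables are discernible from the context, the
Jacobian may be abbreviated as J(x,y), the second form of (42) being
reserved for cases where it is necessary to give the independent variables
explicitly. The following properties are obtained directly from the
definition of the Jacobian:

\[ \begin{aligned}
J(u,v) &= -J(v,u) = 1; \\
J(x,x) &= 0; \quad J(k,x) = 0; \quad k \text{, any constant} \\
J(x,y) &= J(y,-x) = J(-y,x) = -J(y,x)
\end{aligned} \tag{1-43} \]

A further important property of the Jacobian arises if x and y are explicit
functions of z and w, which in turn are explicit functions of u and v.
Writing $\partial(x,y)/\partial(z,w)$ and $\partial(z,w)/\partial(u,v)$ in determinant form, using the rule for
the multiplication of determinants, the abbreviations $x_z = (\partial x/\partial z)_w$, $z_u = (\partial z/\partial u)_v$ and
so on, we have

\[ \begin{vmatrix} x_z & x_w \\ y_z & y_w \end{vmatrix} \times
\begin{vmatrix} z_u & z_v \\ w_u & w_v \end{vmatrix} =
\begin{vmatrix} x_z z_u + x_w w_u & x_z z_v + x_w w_v \\ y_z z_u + y_w w_u & y_z z_v + y_w w_v \end{vmatrix} \]

A typical element of the product is

\[ x_z z_u + x_w w_u = \left(\frac{\partial x}{\partial z}\right)_w \left(\frac{\partial z}{\partial u}\right)_v
+ \left(\frac{\partial x}{\partial w}\right)_z \left(\frac{\partial w}{\partial u}\right)_v = \left(\frac{\partial x}{\partial u}\right)_v \]

the last form resulting from (8), whence

\[ \frac{\partial(x,y)}{\partial(z,w)} \, \frac{\partial(z,w)}{\partial(u,v)} = \frac{\partial(x,y)}{\partial(u,v)} \tag{1-44} \]

In the important special case, y = v,

\[ \frac{\partial(x,y)}{\partial(u,y)} = \begin{vmatrix} x_u & x_y \\ y_u & y_y \end{vmatrix} = \left(\frac{\partial x}{\partial u}\right)_y \tag{1-45} \]

since

\[ y_y = \left(\frac{\partial y}{\partial y}\right)_u = 1 \quad \text{and} \quad y_u = \left(\frac{\partial y}{\partial u}\right)_y = 0 \]

¹¹ Shaw, A. N., Phil. Trans. Roy. Soc. (London) A234, 299-328 (1935).

¹² The properties of determinants, which are used here, are discussed in Chapter 10.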
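The product rule (1-44) and the special case (1-45) lend themselves to a quick numerical check. The sketch below is only an illustration: the functions `z_of`, `w_of`, `x_of` and `y_of` are arbitrary smooth functions chosen for the test (they do not come from the text), and the Jacobians are estimated by central differences rather than computed exactly.

```python
import math

def jacobian(f, g, p, q, h=1e-5):
    """Central-difference estimate of d(f, g)/d(p, q) at the point (p, q)."""
    f_p = (f(p + h, q) - f(p - h, q)) / (2 * h)
    f_q = (f(p, q + h) - f(p, q - h)) / (2 * h)
    g_p = (g(p + h, q) - g(p - h, q)) / (2 * h)
    g_q = (g(p, q + h) - g(p, q - h)) / (2 * h)
    return f_p * g_q - f_q * g_p

# Hypothetical intermediate variables z(u, v) and w(u, v).
def z_of(u, v):
    return u * v + math.sin(u)

def w_of(u, v):
    return u - v ** 2

# Hypothetical functions x(z, w) and y(z, w).
def x_of(z, w):
    return z ** 2 + w

def y_of(z, w):
    return math.exp(0.3 * z) * w

def x_uv(u, v):
    """x expressed through the independent variables u, v."""
    return x_of(z_of(u, v), w_of(u, v))

def y_uv(u, v):
    """y expressed through the independent variables u, v."""
    return y_of(z_of(u, v), w_of(u, v))

def product_rule_sides(u, v):
    """Both sides of (1-44) evaluated at (u, v)."""
    z, w = z_of(u, v), w_of(u, v)
    lhs = jacobian(x_of, y_of, z, w) * jacobian(z_of, w_of, u, v)
    rhs = jacobian(x_uv, y_uv, u, v)
    return lhs, rhs

def special_case_sides(u, v, h=1e-5):
    """Both sides of (1-45): d(x, v)/d(u, v) and (dx/du) at constant v."""
    lhs = jacobian(x_uv, lambda a, b: b, u, v, h)
    rhs = (x_uv(u + h, v) - x_uv(u - h, v)) / (2 * h)
    return lhs, rhs

if __name__ == "__main__":
    print(product_rule_sides(0.7, 1.2))
    print(special_case_sides(0.7, 1.2))
```

The two members of each returned pair should agree to within the accuracy of the finite differences, whatever smooth test functions are substituted.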


Since many thermodynamic functions are of the form f(x,y,z) = 0, where
any one variable is determined by the other two, we may write from (4),

\[ dz = \left(\frac{\partial z}{\partial x}\right)_y dx + \left(\frac{\partial z}{\partial y}\right)_x dy \]

or using (45)

\[ dz = \frac{\partial(z,y)}{\partial(x,y)}\,dx + \frac{\partial(z,x)}{\partial(y,x)}\,dy \]

Expressing each of these variables in terms of two new independent
variables, r and s, and using the abbreviations $J(z,y) = \partial(z,y)/\partial(r,s)$, etc.,
(44) enables us to write

\[ dz = \frac{J(z,y)}{J(x,y)}\,dx + \frac{J(z,x)}{J(y,x)}\,dy \]

If we multiply by J(x,y),

\[ J(z,y)\,dx + J(x,z)\,dy + J(y,x)\,dz = 0 \tag{1-46} \]

since J(x,y) = −J(y,x), etc., from (43). If two more variables, u and v,
are related to r and s in the same way, (46) may be divided by du at
constant v, giving

\[ J(z,y)\left(\frac{\partial x}{\partial u}\right)_v + J(x,z)\left(\frac{\partial y}{\partial u}\right)_v + J(y,x)\left(\frac{\partial z}{\partial u}\right)_v = 0 \]

so that finally, again because of (45),

\[ J(z,y)J(x,v) + J(x,z)J(y,v) + J(y,x)J(z,v) = 0 \tag{1-47} \]

Problem. If r, s are functions of x, y, z and the latter in turn are functions of the
independent variables u, v, show that

\[ J(r,s/u,v) = J(r,s/x,y)\,J(x,y/u,v) + J(r,s/y,z)\,J(y,z/u,v) + J(r,s/z,x)\,J(z,x/u,v) \]
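The three-term identity (1-47) can be spot-checked numerically. In the following sketch the four functions `x`, `y`, `z`, `v` of the independent variables r and s are arbitrary choices made only for the test, and each Jacobian J(f,g) = ∂(f,g)/∂(r,s) is estimated by central differences.

```python
import math

def jacobian(f, g, r, s, h=1e-5):
    """Central-difference estimate of d(f, g)/d(r, s) at the point (r, s)."""
    f_r = (f(r + h, s) - f(r - h, s)) / (2 * h)
    f_s = (f(r, s + h) - f(r, s - h)) / (2 * h)
    g_r = (g(r + h, s) - g(r - h, s)) / (2 * h)
    g_s = (g(r, s + h) - g(r, s - h)) / (2 * h)
    return f_r * g_s - f_s * g_r

# Hypothetical smooth functions of the independent variables (r, s).
def x(r, s):
    return r * r + s

def y(r, s):
    return r * s

def z(r, s):
    return math.sin(r) + s * s

def v(r, s):
    return math.exp(r - s)

def identity_residual(r, s):
    """Left-hand side of (1-47); it should vanish up to rounding error."""
    J = lambda f, g: jacobian(f, g, r, s)
    return J(z, y) * J(x, v) + J(x, z) * J(y, v) + J(y, x) * J(z, v)

if __name__ == "__main__":
    print(identity_residual(0.4, 1.1))
```

Because (1-47) is an algebraic identity among 2 × 2 determinants built from the same four gradients, the residual stays at rounding level even with approximate derivatives.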

1.13. Application to Thermodynamics. — This last equation is the
important one which determines all of the thermodynamic partial
derivatives, for if two independent variables, r and s, are chosen which
completely determine the others, x, y, z, v, then any one Jacobian, for example
J(x,y), is given in terms of five others. But if r and s are taken from the
set x, y, z, v, then J(x,y) is given in terms of only four others, since by
(43) $J(r,s) = \partial(r,s)/\partial(r,s) = 1$.

Let us choose p, V, T and S for x, y, z and v, respectively, so that

\[ J(T,V)J(p,S) + J(p,T)J(V,S) + J(V,p)J(T,S) = 0 \tag{1-48} \]

One more reduction is possible since from (34),

\[ \left(\frac{\partial U}{\partial V}\right)_S = -p; \qquad \left(\frac{\partial U}{\partial S}\right)_V = T \]

and

\[ \frac{\partial^2 U}{\partial S\,\partial V} = \left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial p}{\partial S}\right)_V \]

In Jacobian notation,

\[ \frac{J(T,S)}{J(V,S)} = -\frac{J(p,V)}{J(S,V)} \]

Finally since J(V,S) = −J(S,V) from (43), we obtain

\[ J(T,S) = J(p,V) \]

When the following abbreviations

\[ \begin{aligned}
a &= J(V,T) \\
b &= J(p,V) = J(T,S) \\
c &= J(p,S) \\
l &= J(p,T) \\
n &= J(V,S)
\end{aligned} \tag{1-49} \]

are substituted into (48) and (43) is used to change the signs, we have

\[ b^2 + ac - nl = 0 \tag{1-50} \]
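As an illustration of (1-49) and (1-50), the five abbreviations can be evaluated for a concrete substance. The sketch below assumes one mole of a monatomic ideal gas with T and V as independent variables, so that p = RT/V and S = C_v ln T + R ln V up to a constant; this choice of substance, and the numerical values, are assumptions made for the example and are not part of the text.

```python
import math

R = 8.314        # gas constant, J/(mol K)
CV = 1.5 * R     # assumed: monatomic ideal gas

def jacobian(f, g, T, V, hT=1e-3, hV=1e-7):
    """Central-difference estimate of d(f, g)/d(T, V)."""
    f_T = (f(T + hT, V) - f(T - hT, V)) / (2 * hT)
    f_V = (f(T, V + hV) - f(T, V - hV)) / (2 * hV)
    g_T = (g(T + hT, V) - g(T - hT, V)) / (2 * hT)
    g_V = (g(T, V + hV) - g(T, V - hV)) / (2 * hV)
    return f_T * g_V - f_V * g_T

# The four variables p, V, T, S written as functions of (T, V).
def pressure(T, V):
    return R * T / V

def entropy(T, V):
    return CV * math.log(T) + R * math.log(V)   # additive constant dropped

def temperature(T, V):
    return T

def volume(T, V):
    return V

def abbreviations(T, V):
    """The quantities a, b, c, l, n of (1-49), plus J(T,S) as a cross-check of b."""
    J = lambda f, g: jacobian(f, g, T, V)
    return {
        "a": J(volume, temperature),
        "b": J(pressure, volume),
        "b_alt": J(temperature, entropy),
        "c": J(pressure, entropy),
        "l": J(pressure, temperature),
        "n": J(volume, entropy),
    }

def residual_1_50(T, V):
    """b^2 + a c - n l, which (1-50) requires to vanish."""
    q = abbreviations(T, V)
    return q["b"] ** 2 + q["a"] * q["c"] - q["n"] * q["l"]

if __name__ == "__main__":
    q = abbreviations(300.0, 0.024)
    print(q, residual_1_50(300.0, 0.024))
```

With T and V independent, a = J(V,T) comes out as −1, and the residual of (1-50) is small compared with b², as expected.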

It is convenient to list the various Jacobians in rows and columns,
J(x,y) occurring at the intersection of row x with column y. The upper
left-hand block of such a table is immediately filled by using the definitions
(49), the rule for the change of signs, and the fact that J(x,x) = 0 from
(43). The entries for the lower left-hand corner of the table are obtained
by writing the definitions of dU, dH, etc., in Jacobian form. For example,
since

\[ dU = T\,dS - p\,dV \]
\[ J(U,z) = T\,J(S,z) - p\,J(V,z) \]

where z is any required variable. Hence, if z is taken as p and then as V,

\[ J(U,p) = T\,J(S,p) - p\,J(V,p) = -Tc + pb \]
\[ J(U,V) = T\,J(S,V) - p\,J(V,V) = -Tn \]

the last forms following from the part of the table which is already filled
or from the definitions in (49). The upper right-hand corner may be filled
at the same time, without further calculation, by changing all signs. The
table is completed by using relations already found, as for example

\[ \begin{aligned}
J(A,H) = -J(H,A) &= -S\,J(T,H) - p\,J(V,H) \\
&= -S(Tb - Vl) - p(Tn - Vb) \\
&= -T(Sb + pn) + V(Sl + pb)
\end{aligned} \]

The final result is shown in Table 2. The use of it is typified by the
following examples.

Example 1. Evaluate $(\partial F/\partial T)_V$ in terms of other partial derivatives with
T and V as independent variables. In Jacobian notation and from Table 2

\[ \left(\frac{\partial F}{\partial T}\right)_V = \frac{J(F,V)}{J(T,V)} = -\frac{Sa + Vb}{a} = -S - \frac{Vb}{a} \]

But

\[ \frac{b}{a} = \frac{J(p,V)}{J(V,T)} = -\frac{J(p,V)}{J(T,V)} = -\left(\frac{\partial p}{\partial T}\right)_V ; \]

hence,

\[ \left(\frac{\partial F}{\partial T}\right)_V = -S + V\left(\frac{\partial p}{\partial T}\right)_V \]

Example 2. Transform the result of the preceding example into
derivatives with p and S as independent variables. If the previous result is used,
the term a causes trouble, since with p and S as independent variables, we
obtain $a = J(V,T) = \partial(V,T)/\partial(p,S)$, a relation which cannot be reduced
to a single derivative. In general, as we have shown, any partial derivative
may be expressed in terms of not more than three other derivatives of
thermodynamic functions. We therefore use (50), which gives
$a = (nl - b^2)/c$, or,

\[ \left(\frac{\partial F}{\partial T}\right)_V = -S - \frac{Vbc}{nl - b^2} \]

But

\[ \begin{aligned}
b &= J(p,V) = \partial(p,V)/\partial(S,p) = -(\partial V/\partial S)_p \\
c &= J(p,S) = \partial(p,S)/\partial(S,p) = -1 \\
l &= J(p,T) = \partial(p,T)/\partial(S,p) = -(\partial T/\partial S)_p \\
n &= J(V,S) = \partial(V,S)/\partial(S,p) = -(\partial V/\partial p)_S
\end{aligned} \]

hence,

\[ \left(\frac{\partial F}{\partial T}\right)_V = -S - \frac{V(\partial V/\partial S)_p}{(\partial T/\partial S)_p(\partial V/\partial p)_S - (\partial V/\partial S)_p^2} \]

This procedure may be repeated using other quantities, such as T and
S, V and p, and so on, as independent variables. The difficulty in choosing
the proper form of the original relation may usually be removed in the
following way. Referring to the definitions of a, b, c, l and n, it is seen
that each can be reduced to unity by a proper choice of the independent
variables. For example, if the latter are chosen as V and T, a = 1, since
a = J(V,T). In the previous case, c = −1, and it was found advisable
to use some quantity other than a. The situation may be summed up in
the following directions. In case one of the letters in the top line of the set

\[ \begin{bmatrix} a & c & l \\ c & a & n \end{bmatrix} \]

equals unity, do not use the one directly beneath it but transform
to another by means of (50). In this way, the resulting expression
will usually contain only three different partial derivatives. The omission
of b from the above list arises from the fact that even if b = 1, only single
derivatives will occur.

Example 3. Solve for $(\partial p/\partial T)_V$ in terms of $C_v$, $C_p$ and $\mu = (\partial T/\partial p)_H$,
the Joule-Thomson coefficient. Problems of this sort frequently arise where
it is desired to express a partial thermodynamic derivative in terms of other
quantities, which are measured directly. The usual process of obtaining
the relationship is tedious and complex. From the table, it is found that

\[ \begin{aligned}
C_v &= (\partial Q/\partial T)_V = Tn/a \\
C_p &= (\partial Q/\partial T)_p = Tc/l \\
\mu &= (\partial T/\partial p)_H = (Tb - Vl)/Tc \\
(\partial p/\partial T)_V &= -b/a
\end{aligned} \]

Since there are three relations given and only two letters in the last
derivative, it is convenient to write this in the form

\[ (\partial p/\partial T)_V = -b^2/ab \]

and to solve for a, b and $b^2$ in terms of $C_v$, $C_p$ and μ. Using (50) to obtain
a relation between $C_p$ and $b^2$, we have

\[ C_p = T(nl - b^2)/al \]
\[ a = Tn/C_v; \qquad b^2 = nl(C_v - C_p)/C_v; \qquad b = l(\mu C_p + V)/T \]

and finally

\[ \left(\frac{\partial p}{\partial T}\right)_V = \frac{C_p - C_v}{\mu C_p + V} \]

Example 4. Determine $(\partial U/\partial V)_T$ for a gas obeying (i) the ideal gas
law, $pV = RT$; (ii) van der Waals' equation, $(p + \alpha/V^2)(V - \beta) = RT$.
In problems of this sort, the resulting formulas usually contain no more
than one partial derivative instead of three as in the earlier cases. From
Table 2,

\[ \left(\frac{\partial U}{\partial V}\right)_T = \frac{J(U,T)}{J(V,T)} = -\frac{Tb + pa}{a} = -p - \frac{Tb}{a} \]

If p and V are taken as independent variables, b = J(p,V) = 1 and
$a = J(V,T) = -(\partial T/\partial p)_V$, so that

\[ \left(\frac{\partial U}{\partial V}\right)_T = -p + \frac{T}{(\partial T/\partial p)_V} \]

which vanishes for the ideal gas and equals $\alpha/V^2$ for the van der Waals gas.

In Shaw's paper (loc. cit.), auxiliary tables are given to simplify the
calculations for the following cases: the ideal and van der Waals' gas, the
saturated vapor, black-body radiation.

The Jacobian method has been extended by Shaw to include second
derivatives and to apply to systems of variable composition. For these
applications, as well as more detail on the use of the tables, the original
paper should be consulted.
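Example 4 can be checked numerically from the equation of state alone. The sketch below uses the relation $(\partial U/\partial V)_T = T(\partial p/\partial T)_V - p$, which follows from the Table 2 entry together with $b/a = -(\partial p/\partial T)_V$ of Example 1. The names `alpha` and `beta` stand for the two van der Waals constants, and the numerical values are rough, illustrative figures of the order of those for carbon dioxide; they are assumptions for the example only.

```python
R = 8.314  # gas constant, J/(mol K)

def vdw_pressure(T, V, alpha, beta):
    """van der Waals pressure; alpha = beta = 0 gives the ideal gas law."""
    return R * T / (V - beta) - alpha / V ** 2

def dU_dV_at_constant_T(T, V, alpha, beta, h=1e-3):
    """T (dp/dT)_V - p, with the temperature derivative taken by central difference."""
    dp_dT = (vdw_pressure(T + h, V, alpha, beta)
             - vdw_pressure(T - h, V, alpha, beta)) / (2 * h)
    return T * dp_dT - vdw_pressure(T, V, alpha, beta)

if __name__ == "__main__":
    T, V = 300.0, 1.0e-3                # K, m^3/mol (illustrative state)
    alpha, beta = 0.364, 4.27e-5        # illustrative constants (assumed values)
    print(dU_dV_at_constant_T(T, V, 0.0, 0.0))        # ideal gas: close to zero
    print(dU_dV_at_constant_T(T, V, alpha, beta))     # close to alpha / V**2
    print(alpha / V ** 2)
```

For the ideal gas the derivative vanishes, so the internal energy depends on temperature alone; for the van der Waals gas only the attraction constant survives.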


Problem. Prove the following relations: 



1.14. Thermodynamic Systems of Variable Mass. — The development
of thermodynamics up to the time of Gibbs may be briefly summarized by
the equation of Clausius (34) which combined the two laws. The subject
was thus confined to systems of constant total mass. Gibbs showed how
this equation could be extended to include systems of variable mass.¹⁴ If
we consider a system composed of several substances whose masses are
$m_1, m_2, \cdots$ we may change the internal energy not only by varying the
entropy and the volume but also by varying the relative masses. Thus in
place of (35) we have

\[ U = U(S, V, m_1, m_2, \cdots, m_n) \]

and in place of (36)

\[ dU = \left(\frac{\partial U}{\partial S}\right)_{V,m_1,m_2,\cdots} dS + \left(\frac{\partial U}{\partial V}\right)_{S,m_1,m_2,\cdots} dV
+ \left(\frac{\partial U}{\partial m_1}\right)_{S,V,m_2,\cdots} dm_1 + \left(\frac{\partial U}{\partial m_2}\right)_{S,V,m_1,\cdots} dm_2 + \cdots \tag{1-51} \]

If we write

\[ \mu_i = \left(\frac{\partial U}{\partial m_i}\right)_{S,V,m_j} \tag{1-52} \]

we have

\[ dU = T\,dS - p\,dV + \mu_1\,dm_1 + \mu_2\,dm_2 + \cdots \tag{1-53} \]

If dU is eliminated from (53) by using in turn equations (37), (38) and
(39) we obtain

\[ \mu_1 = \left(\frac{\partial H}{\partial m_1}\right)_{S,p,m_2,m_3,\cdots} = \left(\frac{\partial A}{\partial m_1}\right)_{V,T,m_2,m_3,\cdots} = \left(\frac{\partial F}{\partial m_1}\right)_{p,T,m_2,m_3,\cdots} \tag{1-54} \]

The partial derivatives defined by any of these equivalent expressions were
called by Gibbs the chemical potentials. We may also convert (53) into
the equation

\[ dF = -S\,dT + V\,dp + \mu_1\,dm_1 + \mu_2\,dm_2 + \cdots \tag{1-55} \]

At constant temperature and pressure and for a reversible process, as we
have shown, dF = 0; hence according to (55) the condition for equilibrium
reads

\[ dF = \mu_1\,dm_1 + \mu_2\,dm_2 + \cdots = 0 \tag{1-56} \]

From this equation we may derive the celebrated phase rule of Gibbs.
Let us understand by phase a homogeneous part of a system separated from
the rest of the system by recognizable boundaries. Thus a mixture of ice,
liquid water, and steam is a system of three phases. The number of
components is the least number of independently variable constituents
required to express the composition of each phase. In our previous
example there is only one component. In a system composed of an aqueous
solution of sugar there are two components, for it is necessary to specify
the amounts of both water and sugar present. Finally we need a definition
of degree of freedom. It is the number of variables (such as temperature,
pressure, composition of the components) which is required to describe
completely the system at equilibrium. For example, liquid water in the
presence of water vapor is a system of one degree of freedom, for we may
vary either the temperature or the pressure but we cannot change both
simultaneously for then either the liquid or the vapor disappears.

¹³ The Jacobian method has also been described and illustrated with numerous
examples by Sherwood, T. K. and Reed, C. E., "Applied Mathematics in Chemical
Engineering," McGraw-Hill Book Co., New York, 1939.

¹⁴ His results also included other variables such as electric, magnetic, and
gravitational fields as well as surface phenomena.

Suppose a system contains C components and P phases, then an
equation of the form of (55) will hold for each phase. Since F like S and V is an
extensive variable, it follows from (55) that the chemical potentials must be
independent of the masses, so that we may integrate (56) term by term
obtaining

\[ F = \mu_1 m_1 + \mu_2 m_2 + \cdots + \mu_C m_C \tag{1-57} \]

Differentiation of this equation results in

\[ dF = \mu_1\,dm_1 + \mu_2\,dm_2 + \cdots + \mu_C\,dm_C + m_1\,d\mu_1 + m_2\,d\mu_2 + \cdots + m_C\,d\mu_C \]

When it is subtracted from (56) we get

\[ m_1\,d\mu_1 + m_2\,d\mu_2 + \cdots + m_C\,d\mu_C = 0 \tag{1-58} \]

Equilibrium can be established only when an equation of this form holds
for each of the P phases. But there are C + 2 variables $T, p, \mu_1, \mu_2, \cdots, \mu_C$,
hence the number of degrees of freedom f is

\[ f = C + 2 - P \tag{1-59} \]

This simple equation has been of inestimable value in the study and
interpretation of heterogeneous equilibrium by the chemist, physicist and
metallurgist.¹⁵
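The phase rule (1-59) is simple enough to encode directly. The following sketch applies it to the systems mentioned in the text; the function name and the error handling are choices made for the illustration.

```python
def degrees_of_freedom(components, phases):
    """Gibbs phase rule (1-59): f = C + 2 - P."""
    if components < 1 or phases < 1:
        raise ValueError("need at least one component and one phase")
    f = components + 2 - phases
    if f < 0:
        raise ValueError("more phases than can coexist at equilibrium")
    return f

if __name__ == "__main__":
    print(degrees_of_freedom(1, 2))  # liquid water with its vapor
    print(degrees_of_freedom(1, 3))  # ice, liquid water and steam together
    print(degrees_of_freedom(2, 1))  # aqueous sugar solution, single phase
```

Liquid water in contact with its vapor has one degree of freedom, as stated above, while the three-phase mixture of ice, water and steam has none.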

1.15. The Principle of Carathéodory. — In most textbooks of
thermodynamics, the order of presentation parallels the historical development
of the subject. For this reason, considerable attention is paid to several
kinds of ideal or imaginary machines. The customary procedure is to
cite, first of all, the impossibility of constructing perpetual motion machines
of various types; when this is granted it is possible to state the conditions
under which real machines may operate and to derive the whole body of
positive assertions which are incorporated into the science of
thermodynamics. The critical student may feel the need of a more logical and
formal approach, and this will now be given.

¹⁵ Such applications are discussed by Findlay, A., "The Phase Rule and Its
Application," Eighth Edition, Longmans, Green and Co., 1938; Desch, "Metallography,"
Longmans, Green and Co.

We have attempted to emphasize in sec. 1.9 one important
mathematical consequence of the laws of thermodynamics, namely, that
functions such as dU and dS are exact differentials. We now wish to discuss a
more fundamental mathematical property of these laws which was
discovered by Carathéodory. His arguments¹⁶ are derived from the geometric
behavior of a certain differential equation and its solution. As a result, he
is able to obtain in a purely formal way the laws of thermodynamics
without recourse to fictitious machines or such objectionable concepts as the
flow of heat. We cannot reproduce here the complete theory¹⁷ but shall
only give the mathematical details of his treatment of the second law.

Let us assume that a thermodynamic system is composed of n separate
parts, each one of which is characterized by its pressure and volume.
Further, suppose that the whole system is surrounded by adiabatic walls or
thermal insulators while the individual parts of the system are separated
from each other by walls that are perfect conductors of heat. As a result
of experiment, it is found that there is no observable change in the system
(i.e., equilibrium has been reached) when the following conditions are met:

\[ f_1(p_1,V_1) = f_2(p_2,V_2) = \cdots = f_n(p_n,V_n) = \vartheta \tag{1-60} \]

The relation $f_i(p_i,V_i) = \vartheta$ for the i-th part of the system is, of course,
an equation of state, and ϑ is the temperature of the whole system on some
suitable empirical scale. According to the first law (see eq. 27a)

\[ dQ = dU + p\,dV = 0 \tag{1-61} \]

the whole system being adiabatic. Moreover, a similar equation holds for
each part of the system:

\[ dQ_i = dU_i + p_i\,dV_i \tag{1-62} \]

and

\[ dU = \sum_i dU_i; \qquad dQ = \sum_i dQ_i \tag{1-63} \]

As we have shown, $dQ_i$ is not an exact differential. However, it
depends on only two variables, and under these conditions an infinite number
of integrating denominators exist.¹⁸ Hence eq. (62) may be converted into
an exact differential. Let an integrating denominator be $t_i$, so that

\[ d\phi_i = dQ_i/t_i \tag{1-64} \]

is exact. Clearly $\phi_i$ is then a function of the state of the system, hence we
may change (61) in such a manner that the independent variables are ϑ
and $\phi_i$ instead of U and V. The result of this transformation is

¹⁶ Carathéodory, C., Math. Ann. 67, 355 (1909).

¹⁷ Carathéodory's theory has been reviewed by Born, M., Physik. Z. 22, 218, 249,
282 (1922) and by Landé, A., "Handbuch der Physik," Vol. IX, Chapter 4, J. Springer,
Berlin, 1926.




The quantity dQ is not exact, nor is it to be taken for granted that it can
be made exact by the use of an integrating denominator if dQ contains
more than two variables. As a matter of fact, the procedure is possible
only when the differential equation dQ = 0 (known as a Pfaff equation)
possesses a solution, as we shall show in sec. 2.18. In that case (and we
shall here be interested in no other), there is an integrating denominator t
such that

\[ d\phi = dQ/t \tag{1-66} \]

is exact, even when there are n variables. More important for our present
needs is the conclusion drawn from simple geometric considerations that if
there is an integrating denominator, then there are in the neighborhood of
any point P many other points which are not accessible from P along the
path dQ = 0. This formal mathematical consequence of the properties
of the Pfaff equation is known as the principle of Carathéodory. It is
exactly what we need for thermodynamics. Consider, for example, a gas
at a given pressure, $p_1$, and volume, $V_1$. We may expand or compress this
gas adiabatically (i.e., along the path dQ = 0), but the final state of the
system will be characterized by variables $p_2$, $V_2$ which we cannot choose at
will. There are many values of p and V which we are not able to realize
adiabatically.

We refer the reader again to sec. 2.18 for the conditions under which
equations like (65) have a solution, hence an integrating denominator.
We proceed here with the physical results which may be obtained when we
know that the integrating denominator exists. In order to simplify the
situation let us assume that the thermodynamic system is composed of
only two parts. This restriction does not mean that there is any loss in
generality of the final results since all our arguments could easily be
extended to cover a system of any number of parts. With n = 2, it follows
from (63), (64) and (66) that

\[ t\,d\phi = t_1\,d\phi_1 + t_2\,d\phi_2 \tag{1-67} \]

If we take as in (65), $\phi_1$, $\phi_2$ and ϑ as independent variables we see that

\[ \frac{\partial \phi}{\partial \phi_1} = \frac{t_1}{t}; \qquad \frac{\partial \phi}{\partial \phi_2} = \frac{t_2}{t}; \qquad \frac{\partial \phi}{\partial \vartheta} = 0 \tag{1-68} \]

The last equation of (68) shows that φ depends on $\phi_1$ and $\phi_2$ but not on
ϑ, so that according to the other two equations of (68), the ratios $t_1/t$ and
$t_2/t$ are also independent of ϑ:

\[ \frac{\partial}{\partial \vartheta}\left(\frac{t_1}{t}\right) = 0; \qquad \frac{\partial}{\partial \vartheta}\left(\frac{t_2}{t}\right) = 0 \]

This result may be written:

\[ \frac{1}{t_1}\frac{\partial t_1}{\partial \vartheta} = \frac{1}{t_2}\frac{\partial t_2}{\partial \vartheta} = \frac{1}{t}\frac{\partial t}{\partial \vartheta} \tag{1-69} \]

Now $t_1$ is a function of the state of the first member of the system and
therefore could depend only on $\phi_1$ and ϑ, while $t_2$ could depend only on $\phi_2$ and ϑ.
However, the first equality in (69) indicates that the logarithmic derivatives
of $t_1$ and $t_2$ with respect to ϑ can depend on neither $\phi_1$ nor $\phi_2$; they must
be functions of ϑ alone, and we may write

\[ \frac{\partial \ln t_1}{\partial \vartheta} = \frac{\partial \ln t_2}{\partial \vartheta} = \frac{\partial \ln t}{\partial \vartheta} = g(\vartheta) \tag{1-70} \]

where g(ϑ) is a function which is common to all systems in thermal contact,
not dependent on any special properties of the substances which compose
the system. Integrating (70), we obtain

\[ \ln t = \int g(\vartheta)\,d\vartheta + \ln A(\phi) \tag{1-71} \]

where the integration constant ln A depends only on the quantity φ.
Note that we have dropped the subscripts from t and φ so that eq. (71)
refers to any thermodynamic system and t is the appropriate integrating
denominator for the particular system under consideration. We see from
(71) the important fact that this denominator can be separated into two
parts, one depending only on the empirical temperature ϑ and the other
only on variables of the state of the system such as φ whose differential is
exact.

¹⁸ The proof of this fact as well as other mathematical conclusions reached here
are given in sec. 2.18. Except for the proofs, the present section is complete in itself.

Let us rewrite (71) in the form

\[ t = A\,e^{\int g(\vartheta)\,d\vartheta} \tag{1-72} \]

and define the absolute temperature T by the relation

\[ T(\vartheta) = C\,e^{\int g(\vartheta)\,d\vartheta} \tag{1-73} \]

The constant C relating ϑ and T may be determined by requiring that
between two fixed points, say the boiling point and freezing point of water,
T shall increase by 100 units. It should be noticed that there is no additive
constant in (73), so that if C is positive, the smallest value of T is zero, and
there is no upper limit for T.

If our thermodynamic system contains only one part, we may use (72),
(73) and (66) to write

\[ dQ = t\,d\phi = \frac{T}{C}\,A\,d\phi \tag{1-74} \]

Also, if we put

\[ S = \frac{1}{C}\int A(\phi)\,d\phi + \text{const.} \tag{1-75} \]

we obtain the well-known expression for the second law of thermodynamics
which defines a change in entropy, dS:

\[ dQ = T\,dS \tag{1-76} \]

The entropy is immediately seen to be a function of the state of the system,
constant along an adiabatic path (dQ = 0). It is determined except for an
additive constant. We also note from (76) that the absolute temperature
is an integrating denominator of the inexact differential dQ.
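The statement that T is an integrating denominator of dQ can be illustrated on the simplest case. The sketch below assumes one mole of an ideal gas with constant heat capacity, for which dQ = dU + p dV becomes C_v dT + (RT/V) dV; the choice of gas and the numerical values are assumptions for the example. A form M dT + N dV is exact only if ∂M/∂V = ∂N/∂T, and the code estimates the difference of these two derivatives before and after division by T.

```python
R = 8.314        # gas constant, J/(mol K)
CV = 1.5 * R     # assumed constant heat capacity (monatomic ideal gas)

def dQ_coefficients(T, V):
    """Coefficients (M, N) of dQ = M dT + N dV for the ideal gas."""
    return CV, R * T / V

def dQ_over_T_coefficients(T, V):
    """Coefficients of dQ/T, i.e. of the candidate exact differential dS."""
    M, N = dQ_coefficients(T, V)
    return M / T, N / T

def exactness_defect(coefficients, T, V, hT=1e-3, hV=1e-7):
    """dM/dV - dN/dT by central differences; zero for an exact differential."""
    dM_dV = (coefficients(T, V + hV)[0] - coefficients(T, V - hV)[0]) / (2 * hV)
    dN_dT = (coefficients(T + hT, V)[1] - coefficients(T - hT, V)[1]) / (2 * hT)
    return dM_dV - dN_dT

if __name__ == "__main__":
    T, V = 300.0, 0.024
    print(exactness_defect(dQ_coefficients, T, V))         # about -R/V: not exact
    print(exactness_defect(dQ_over_T_coefficients, T, V))  # about zero: exact
```

The defect of dQ itself is −R/V, whereas that of dQ/T vanishes, in agreement with (1-76).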

When the system is made up of two parts which are in thermal contact,
eqs. (67) and (74) may be combined to give

\[ A\,d\phi = A_1\,d\phi_1 + A_2\,d\phi_2 \tag{1-77} \]

We know that $A_1$ is a function of $\phi_1$ and that $A_2$ is a function of $\phi_2$. We
want to prove that A is a function of φ which in turn depends on $\phi_1$ and
$\phi_2$. Let us assume that A = A(φ). Then

\[ \frac{\partial A}{\partial \phi_1} = \frac{dA}{d\phi}\frac{\partial \phi}{\partial \phi_1}; \qquad \frac{\partial A}{\partial \phi_2} = \frac{dA}{d\phi}\frac{\partial \phi}{\partial \phi_2} \]

If we eliminate dA/dφ from these two equations we obtain

\[ \frac{\partial A}{\partial \phi_1}\frac{\partial \phi}{\partial \phi_2} - \frac{\partial A}{\partial \phi_2}\frac{\partial \phi}{\partial \phi_1} = 0 \]

This result is often written in the Jacobian notation of sec. 1.12

\[ J(A,\phi/\phi_1,\phi_2) = 0 \tag{1-78} \]

It tells that if A is a function of φ, J(A,φ) = 0 and conversely if
J(A,φ) = 0, then A is a function of φ.¹⁹ We can easily prove in our case
that the Jacobian does vanish. Differentiation of (77) results in

\[ A\frac{\partial \phi}{\partial \phi_1} = A_1; \qquad A\frac{\partial \phi}{\partial \phi_2} = A_2 \]

\[ \frac{\partial A}{\partial \phi_2}\frac{\partial \phi}{\partial \phi_1} + A\frac{\partial^2 \phi}{\partial \phi_1\,\partial \phi_2} = 0; \qquad
\frac{\partial A}{\partial \phi_1}\frac{\partial \phi}{\partial \phi_2} + A\frac{\partial^2 \phi}{\partial \phi_2\,\partial \phi_1} = 0 \]

hence by subtraction we obtain (78). Thus A is a function of φ. Under
these conditions we have an equation similar to (76) for each part of the
thermodynamic system, and since $dQ = \sum dQ_i$, we finally conclude from
(75) and (77) that $dS = \sum dS_i$.

¹⁹ This result which may be applied in the case of n variables is often useful. If the
n functions $y_1, y_2, \cdots, y_n$ are not independent of each other the Jacobian vanishes; if
J = 0, then the n functions are related by some equation $f(y_1, y_2, \cdots, y_n) = 0$.


CHAPTER 2 

ORDINARY DIFFERENTIAL EQUATIONS 


2.1. Preliminaries. — The customary classification distinguishes two 
main types: ordinary and partial differential equations. The former 
contain only one independent variable and, as a consequence, total deriva- 
tives. They represent a relation between the primitive of the dependent 
variable (y), its various derivatives, and functions of the independent 
variable (x). Partial differential equations, whose study will be reserved 
for Chapter 7, contain several independent variables and hence partial 
derivatives. Concerning terminology, the following is to be noted in 
connection with ordinary differential equations. 

The order of a differential equation is the order of its highest derivative; 
its degree is the degree (or power) of the derivative of highest order after 
the equation has been rationalized, i.e., after fractional powers of all 
derivatives have been removed. Thus the equation 


dx^ \dx) 


+ xy 


0 


is of the second order and the first degree, while 


dx^ y dx 


+ xy = 0 


is of the second order and the second degree. If the dependent variable
and all its derivatives occur in the first degree and not multiplying each
other, the equation is said to be linear. The solution of an equation of
n-th order involves, in principle, the carrying out of n quadratures or
integrations. Since each of them introduces one arbitrary constant, the final
expression for the dependent variable will contain n arbitrary constants.
However, a solution in which one or more of these constants are given
specific values, for instance the value zero, will also satisfy the differential
equation. In view of this consideration two types of solutions of an
ordinary differential equation of n-th order may be distinguished: (1) the
complete or general solution which contains its full complement of n
independent¹ arbitrary constants; (2) particular solutions, obtainable from
the general one by fixing one or more of the constants. In addition to
these, differential equations of degree higher than the first frequently possess
solutions, known as singular ones, which cannot be formed from the general
solution in this manner. An example of these will be discussed briefly in
sec. 2.6; they are rarely of interest in physical or chemical applications.


FIRST ORDER EQUATIONS 


An equation of the first order can always be solved although the solu- 
tion may sometimes not be expressible in terms of familiar or named 
functions. Methods of solution applicable in the most frequently occurring 
cases will now be given, and the discussion of each method will be followed 
by a list of problems, arising in physics and chemistry, which lead to 
differential equations solvable by the scheme in question. 

2.2. The Variables are Separable. — This is true when the equation,
which may originally appear in the form $f_1(x,y)\,\dfrac{dy}{dx} + f_2(x,y) = 0$, is
reducible to

\[ f(x)\,dx + g(y)\,dy = 0 \]

Such an equation can be integrated at once and leads to a relation between
y and x.


Examples.

a. Organic growth; radioactive decay.

Bacterial cultures in an unlimited nutritive medium grow at a time rate
proportional to the number of bacteria present at any moment. Hence if
the time t is regarded as independent variable and N, the number of bacteria
present at time t, as dependent variable,

\[ \frac{dN}{dt} = aN \]

a being the rate of growth per bacterium. This may be written

\[ \frac{dN}{N} = a\,dt \]

which, on integration, yields $\ln N = at + c$, or $N = Ce^{at}$. If the original
number of bacteria at t = 0 is $N_0$, the constant C must have the value $N_0$
to conform to this physical condition.

Radioactive atoms decay at a rate proportional to the number of atoms,
N, present at any moment, t. Hence $dN/dt = -\lambda N$, which has the
solution $N = N_0 e^{-\lambda t}$. The disintegration constant λ measures the time rate
of decay per atom. It is a fundamental quantity characteristic of each
radioactive substance.

¹ Arbitrary constants are said to be independent if two or more of them cannot be
replaced by an equivalent single one. Thus the constants $c_1$ and $c_2$ in the functions
$ax + c_1 + c_2$ and $c_1 e^{x + c_2}$ are not independent because these functions may be written
$ax + c$ and $ce^x$ respectively.

This distinction is elementary. A more adequate analysis would focus attention
upon independent solutions of the differential equation rather than independent
constants. Solutions are independent when the so-called Wronskian determinant fails to
vanish. This matter is treated in sec. 3.13.
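The decay law can be compared with a direct numerical integration of $dN/dt = -\lambda N$. The sketch below uses a classical fourth-order Runge-Kutta step as the integrator, which is a choice made for the example; the half-life $\ln 2/\lambda$ is the time at which the closed-form solution falls to $N_0/2$.

```python
import math

def decay_exact(n0, lam, t):
    """Closed-form solution N = N0 exp(-lambda t)."""
    return n0 * math.exp(-lam * t)

def decay_rk4(n0, lam, t, steps=1000):
    """Integrate dN/dt = -lambda N from 0 to t with fourth-order Runge-Kutta."""
    n = n0
    dt = t / steps
    rate = lambda value: -lam * value
    for _ in range(steps):
        k1 = rate(n)
        k2 = rate(n + 0.5 * dt * k1)
        k3 = rate(n + 0.5 * dt * k2)
        k4 = rate(n + dt * k3)
        n += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return n

def half_life(lam):
    """Time after which half of the atoms remain."""
    return math.log(2) / lam

if __name__ == "__main__":
    lam = 0.1  # illustrative disintegration constant, per unit time
    print(decay_exact(1000.0, lam, half_life(lam)))
    print(decay_rk4(1000.0, lam, 10.0), decay_exact(1000.0, lam, 10.0))
```

Replacing −λ by the growth rate a turns the same routine into a model of the bacterial culture.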


b. Flow of water from an orifice.

A vertical tank of uniform cross-section A is filled with water to an initial
height $h_0$. Water flows out through a hole of area a. It is desired to find
the height of the water, h, in the tank as a function of the time, t. The
volume flowing out in time dt is $av\,dt$, where v is the velocity of the water
at the orifice at time t. The loss of height in the tank is dh, hence the loss
of volume A dh. Therefore

\[ av\,dt = -A\,dh \]

But the velocity is related to the height by Torricelli's formula: $v = c\sqrt{2gh}$.
The empirical constant c would be unity if there were no obstruction and no
"vena contracta" near the orifice; for ordinary small holes with sharp
edges it is 0.6. Thus

\[ ac\sqrt{2gh}\,dt = -A\,dh \]

or

\[ \frac{dh}{\sqrt{h}} = -\frac{ac}{A}\sqrt{2g}\,dt \]

On integrating this we have

\[ \sqrt{h} = \sqrt{h_0} - \frac{ac}{2A}\sqrt{2g}\,t \]

where the constant of integration has been so adjusted that $h = h_0$ at
t = 0.
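The closed-form result for the draining tank can be tested against a numerical integration of $dh/dt = -(ac/A)\sqrt{2gh}$. In the sketch below the emptying time is obtained by setting the closed-form height to zero; the tank dimensions are illustrative values, and the Runge-Kutta integrator is a choice made for the example.

```python
import math

G = 9.81  # acceleration of gravity, m/s^2

def height_exact(t, h0, A, a, c=0.6):
    """Closed form: sqrt(h) = sqrt(h0) - (a c / 2A) sqrt(2g) t, cut off at zero."""
    root = math.sqrt(h0) - (a * c / (2 * A)) * math.sqrt(2 * G) * t
    return max(root, 0.0) ** 2

def emptying_time(h0, A, a, c=0.6):
    """Time at which the closed-form height reaches zero."""
    return (2 * A / (a * c)) * math.sqrt(h0 / (2 * G))

def height_rk4(t, h0, A, a, c=0.6, steps=2000):
    """Integrate dh/dt = -(a c / A) sqrt(2 g h) with fourth-order Runge-Kutta."""
    rate = lambda h: -(a * c / A) * math.sqrt(2 * G * max(h, 0.0))
    h = h0
    dt = t / steps
    for _ in range(steps):
        k1 = rate(h)
        k2 = rate(h + 0.5 * dt * k1)
        k3 = rate(h + 0.5 * dt * k2)
        k4 = rate(h + dt * k3)
        h += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return h

if __name__ == "__main__":
    h0, A, a = 1.0, 1.0, 0.001   # illustrative: 1 m head, 1 m^2 tank, 10 cm^2 hole
    t_empty = emptying_time(h0, A, a)
    print(t_empty)
    print(height_exact(0.5 * t_empty, h0, A, a), height_rk4(0.5 * t_empty, h0, A, a))
```

Since $\sqrt{h}$ falls linearly with time, the height at half the emptying time is one quarter of the initial height.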


c. Heat flow.

When heat flows through a body the temperature, T, is in general a
complicated function of the coordinates within the body. In simple cases,
however, it may depend only on a single coordinate, x (distance from a heated
plane, or distance from a point source of heat). In that case, the rate at
which heat crosses an area A perpendicular to x is given by

\[ R = -kA\,\frac{dT}{dx} \tag{2-1} \]

and R is constant because of the continuity of flow. The quantity k is
known as the thermal conductivity.

(α) If the body is a slab with plane parallel faces, one of which is
maintained at a temperature $T_1$, integration of (1) leads to

\[ T = T_1 - \frac{Rx}{kA} \]

x being the distance from the heated face. From this one obtains the
elementary relation

\[ R = kA\,\frac{T_1 - T_2}{d} \tag{2-2} \]

for the heat transfer across a plate of thickness d whose second face is at
temperature $T_2$.

(β) If a heat source is placed at the center of a sphere, the temperature
is a function of r alone. Here $A = 4\pi r^2$, and (1) reads $-4\pi k r^2\,(dT/dr) = R$,
which gives

\[ T = \frac{R}{4\pi k r} + C \]

In this case, the temperature is not a linear function of the distance from
the source as it was in (α).

(γ) At constant external temperature the thickness of ice on quiescent
water increases as the square root of the time. To show this we write (2)
in the form

\[ R = \frac{dH}{dt} = kA\,\frac{T_1 - T_2}{x} \]

where x now represents the thickness of ice and dH the quantity of heat
transported away from the lower surface of the ice in time dt. This,
however, is proportional to the thickness dx which is added on to the already
existing layer in time dt. Hence $dx/dt = C/x$, C representing a constant.
From this it follows by integration that

\[ x^2 \propto t \]


d. Salt dissolving in water.

When $x_0$ grams of salt are placed in M grams of water at time t = 0, how
many grams will remain undissolved at time t? The rate of solution,
dx/dt, is proportional, (a) to the number of grams, x, undissolved at time t,
(b) to the difference between the saturation concentration, X/M, and the
actual concentration, $(x_0 - x)/M$. (X is the number of grams of salt
that would produce saturation.) Thus

\[ \frac{dx}{dt} = -kx\left(\frac{X}{M} - \frac{x_0 - x}{M}\right) \tag{2-3} \]

To solve, we write

\[ \frac{dx}{(X - x_0)x + x^2} = \frac{1}{X - x_0}\left(\frac{1}{x} - \frac{1}{X - x_0 + x}\right)dx = -\frac{k}{M}\,dt \]

Integration then leads to: $\ln\dfrac{x}{X - x_0 + x} + c = -\dfrac{X - x_0}{M}\,kt$. When
the constant c is adjusted so that $x = x_0$ at t = 0, the result is

\[ \ln\frac{(X - x_0 + x)\,x_0}{xX} = \frac{X - x_0}{M}\,kt \]

If $x_0 = X$, then the solution is $\dfrac{1}{x} - \dfrac{1}{x_0} = (k/M)t$, as one may easily verify
by going back to equation (3).
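Solving the logarithmic relation for x gives an explicit formula that can be checked against the rate law (3). In the sketch below, the explicit form $x = (X - x_0)x_0/(Xe^{(X - x_0)kt/M} - x_0)$ is obtained by rearranging the result above; the function names and numerical values are illustrative assumptions.

```python
import math

def undissolved(t, x0, X, M, k):
    """Grams of salt still undissolved at time t."""
    D = X - x0
    if abs(D) < 1e-12 * X:
        # Limiting case x0 = X: 1/x - 1/x0 = (k/M) t
        return 1.0 / (1.0 / x0 + k * t / M)
    E = math.exp(D * k * t / M)
    return D * x0 / (X * E - x0)

def rate_law(x, x0, X, M, k):
    """Right-hand side of (2-3)."""
    return -k * x * (X / M - (x0 - x) / M)

def derivative_mismatch(t, x0, X, M, k, h=1e-5):
    """Difference between the numerical dx/dt and the rate law at time t."""
    slope = (undissolved(t + h, x0, X, M, k)
             - undissolved(t - h, x0, X, M, k)) / (2 * h)
    return slope - rate_law(undissolved(t, x0, X, M, k), x0, X, M, k)

if __name__ == "__main__":
    # Illustrative values: 20 g of salt, saturation at 36 g, 100 g of water.
    print(undissolved(5.0, 20.0, 36.0, 100.0, 0.5))
    print(derivative_mismatch(5.0, 20.0, 36.0, 100.0, 0.5))
```

The general formula passes continuously into the special solution as $x_0$ approaches X, which gives a further check on both expressions.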

e. Atmospheric pressure at any height.

The increment of pressure between two points in the atmosphere differing
in height by dh is $dP = -\rho g\,dh$, if ρ is the density at height h. But ρ is
related to P by the expression $P\rho^{-\gamma} = P_0\rho_0^{-\gamma}$, which is valid for adiabatic
expansion of air if γ is taken to be 1.4.² The quantities $P_0$ and $\rho_0$ are the
sea level values of P and ρ. Therefore

\[ dP = -\left(\frac{P}{P_0}\right)^{1/\gamma}\rho_0 g\,dh \]

and this, on integration, gives

\[ \left(\frac{P}{P_0}\right)^{(\gamma - 1)/\gamma} = 1 - \frac{\gamma - 1}{\gamma}\,\frac{\rho_0 g h}{P_0} \]

the constant of integration being adjusted so that $P = P_0$ at h = 0.
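The adiabatic pressure formula can be evaluated directly and checked against $dP = -\rho g\,dh$. The sea-level pressure, density and gravitational acceleration used in the sketch below are standard reference figures supplied for the example; they are not taken from the text.

```python
GAMMA = 1.4       # ratio of specific heats of air
P0 = 101325.0     # assumed sea-level pressure, Pa
RHO0 = 1.225      # assumed sea-level density, kg/m^3
G = 9.80665       # acceleration of gravity, m/s^2

def pressure(h):
    """Adiabatic atmosphere: (P/P0)^((g-1)/g) = 1 - ((g-1)/g) rho0 g h / P0."""
    base = 1.0 - ((GAMMA - 1.0) / GAMMA) * RHO0 * G * h / P0
    if base <= 0.0:
        return 0.0   # above the height at which the model pressure vanishes
    return P0 * base ** (GAMMA / (GAMMA - 1.0))

def density(h):
    """Density from the adiabatic relation P rho^(-gamma) = P0 rho0^(-gamma)."""
    return RHO0 * (pressure(h) / P0) ** (1.0 / GAMMA)

def hydrostatic_mismatch(h, step=1.0):
    """dP/dh (central difference) plus rho g; should be nearly zero."""
    slope = (pressure(h + step) - pressure(h - step)) / (2 * step)
    return slope + density(h) * G

if __name__ == "__main__":
    for height in (0.0, 1000.0, 5000.0, 10000.0):
        print(height, pressure(height), density(height))
    print(hydrostatic_mismatch(5000.0))
```

In this model the pressure falls to zero at the finite height $\gamma P_0/[(\gamma - 1)\rho_0 g]$, a feature of the adiabatic assumption.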


f. Homogeneous gas reactions. 

Chemical reactions involving but a single phase are said to be homogeneous. 
Among these there may be distinguished unimolecular, bimolecular, ter- 
molecular reactions and so on. In the unimolecular case, the number of 
molecules undergoing a chemical change is at any instant proportional to 
the number of molecules present. The decomposition of nitrogen pentoxide
into oxygen and nitrogen tetroxide ($2\mathrm{N_2O_5} \rightarrow \mathrm{O_2} + 2\mathrm{N_2O_4}$) is an example
of this kind, the differential equation being similar to that describing
radioactive decay (Example a).

In a bimolecular reaction, of which there are numerous examples, substances
A and B form molecules of type C. If $a$ and $b$ are the original
concentrations of A and B respectively, and $x$ is the concentration of C at a
given instant, then

$$\frac{dx}{dt} = k(a - x)(b - x)$$

* $\gamma$ is the ratio of the specific heat at constant pressure to that at constant volume.

To integrate this equation, the expression $\frac{1}{(a - x)(b - x)}$ is resolved into
the partial fractions $\frac{1}{a - b}\left(\frac{1}{b - x} - \frac{1}{a - x}\right)$. We then have

$$\frac{1}{a - b}\int\left(\frac{dx}{b - x} - \frac{dx}{a - x}\right) = \int k\,dt$$

whence

$$\frac{1}{a - b}\ln\frac{a - x}{b - x} = kt + C$$

Since $x = 0$ at $t = 0$, $C = \frac{1}{a - b}\ln\frac{a}{b}$, so that

$$\frac{b(a - x)}{a(b - x)} = e^{(a - b)kt}$$

From this, the reaction rate is seen to be

$$k = \frac{1}{t(a - b)}\ln\frac{b(a - x)}{a(b - x)}$$

The concentration of substance C is

$$x = \frac{a\left(1 - e^{(a - b)kt}\right)}{1 - \frac{a}{b}\,e^{(a - b)kt}}$$


When the original concentrations $a$ and $b$ are equal, the expression for $k$
becomes indeterminate, but on putting $b = a + \epsilon$ and letting $\epsilon$ approach
zero, an expansion of the logarithm yields

$$k = \frac{1}{at}\,\frac{x}{a - x}$$

which is also seen to be a solution of the differential equation

$$\frac{dx}{dt} = k(a - x)^2$$
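The bimolecular formulas lend themselves to a quick numerical check. The sketch below is an illustration added here, not the authors' own procedure: it evaluates $x(t)$, recovers $k$ from it, and treats the case $a = b$ through the limiting form. The function names and sample concentrations are assumptions.

```python
import math

def concentration(t, a, b, k):
    """x(t) for dx/dt = k (a - x)(b - x) with x(0) = 0."""
    if math.isclose(a, b):
        # Limiting form k = x / (a t (a - x)), solved for x.
        return a * a * k * t / (1.0 + a * k * t)
    e = math.exp((a - b) * k * t)
    return a * b * (e - 1.0) / (a * e - b)

def rate_constant(t, x, a, b):
    """k recovered from a measured concentration x at time t (requires a != b)."""
    return math.log(b * (a - x) / (a * (b - x))) / (t * (a - b))

# Assumed sample values: a = 2, b = 1, k = 0.3 in arbitrary consistent units.
x_sample = concentration(1.5, 2.0, 1.0, 0.3)
print(x_sample, rate_constant(1.5, x_sample, 2.0, 1.0))
```

Taking $b$ very slightly different from $a$ should reproduce the equal-concentration result, which is the content of the limiting argument in the text.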


Other types of reactions will be dealt with in the problems on p. 40. As to
terminology, we note that a rate law for multimolecular reactions of the
form

$$\frac{dx}{dt} = k(a_1 - x)^{n_1}(a_2 - x)^{n_2} \cdots (a_s - x)^{n_s}$$

is often said to describe a reaction of the $n$-th order, where

$$n = \sum_{i=1}^{s} n_i$$

g. Clapeyron's equation.

Any phase change of a substance which takes place at constant pressure and
temperature conforms to Clapeyron's equation:

$$\frac{dP}{dT} = \frac{l}{T(V_f - V_i)}$$

Here $l$ represents the latent heat of the process, $V_f$ and $V_i$ the volume per
mole of the final and the initial phase respectively, and $P$ the pressure.
This equation may be applied to the process of sublimation, yielding an
approximate expression for the vapor pressure as a function of the temperature.
In that case $l$, the latent heat of sublimation of the solid, is nearly
constant over a range of temperatures, and $V_i$, the volume of the solid,
may be neglected in comparison with that of the vapor, $V_f$. The vapor,
though not a perfect gas, will be taken to satisfy $V_f = RT/P$. Clapeyron's
equation then becomes

$$\frac{dP}{dT} = \frac{lP}{RT^2}$$

which on integration gives

$$P = \text{const.}\; e^{-l/RT}$$

an equation often called the Clausius-Clapeyron equation. This result is
found to be valid over small ranges of temperature, for the vapor pressure of
both solids and liquids. A more refined result may be obtained by introducing
for $l$ a more adequate approximation.

h. Centrifuge problem. 

When a cylinder of height $h$, filled with fluid, is rotating about its axis, the
pressure within the fluid will not be constant but will depend on $r$. Consider
a cylindrical shell of fluid of thickness $dr$, the surfaces of which are
coaxial with the rotating vessel. The net force pushing inward on this
shell is $2\pi rh\,dP$. This must equal the centripetal force due to the angular
speed $\omega$, namely $m\omega^2r$, where $m$, the mass of the fluid, is given by $2\pi rh\,dr \cdot \rho$.
Hence

$$2\pi rh\,dP = 2\pi rh\rho\,dr \cdot \omega^2r$$

($\alpha$) If the fluid is a liquid, the density, $\rho$, is constant and the solution is

$$P = \tfrac{1}{2}\rho\omega^2r^2 + P_0$$

($\beta$) If the fluid is a gas, $P = c\rho$ (since $PV$ = const.), the solution is
then

$$P = P_0\,e^{\omega^2r^2/2c}$$

i. Soap film.

If a soap film is stretched between two circular wires, both having their
planes perpendicular to the line joining their centers, it will form a figure
of revolution about that line. At every point such as P (cf. Fig. 1) the
horizontal force acting around a vertical section of the film is the same.
Hence

$$2\pi yT\cos\theta = \text{const.}$$

where $T$ is the surface tension of the film. But

$$\cos\theta = \left[1 + \left(\frac{dy}{dx}\right)^2\right]^{-1/2}$$

so that

$$y\left[1 + \left(\frac{dy}{dx}\right)^2\right]^{-1/2} = c$$

$T$ being a constant. Solving for the derivative,

$$\frac{dy}{dx} = \frac{(y^2 - c^2)^{1/2}}{c}$$

which leads to

$$y = c\cosh\frac{x + c_1}{c}$$

The constants $c$ and $c_1$ may be expressed in terms of the distance between
the wires and their radius. The longitudinal section of the film is seen to
be a catenary.

The examples above seem sufficient to illustrate the method under dis- 
cussion. The problems leading to separable first order equations are very 
numerous. 


Problems. 

a. Helmholtz' equation.

If a circuit has resistance $R$ and inductance $L$, the current $I$ in it obeys the differential
equation

$$L\frac{dI}{dt} + RI = E$$

where $E$ is the impressed or external electromotive force. Show that the growth of a
current ($E$ = const., $I = 0$ at $t = 0$) is described by

$$I = \frac{E}{R}\left(1 - e^{-(R/L)t}\right)$$

and the decay ($E = 0$, $I = I_0$ at $t = 0$) by

$$I = I_0\,e^{-(R/L)t}$$

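b. Solve the equation for termolecular reactions:

$$\frac{dx}{dt} = k(a - x)(b - x)(c - x).$$

Ans. (with $x = 0$ at $t = 0$) $\left(1 - \frac{x}{a}\right)^{b - c}\left(1 - \frac{x}{b}\right)^{c - a}\left(1 - \frac{x}{c}\right)^{a - b} = e^{(a - b)(b - c)(c - a)kt}$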

c. Solve the equation for opposing unimolecular and bimolecular reactions:

$$\frac{dx}{dt} = k_1(a - x) - k_2x^2$$

under the condition $x = 0$ at $t = 0$.

Ans. $\frac{a}{x}\,\frac{k_1}{k_2} = A\coth Ak_2t + \frac{k_1}{2k_2}$, where $A^2 = \frac{k_1}{k_2}\left(a + \frac{k_1}{4k_2}\right)$

Show that, when equilibrium is established ($t = \infty$),

$$\frac{x^2}{a - x} = \frac{k_1}{k_2}$$

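d. Solve the equation for consecutive unimolecular reactions of the type

$$A \xrightarrow{k_1} B \xrightarrow{k_2} C$$

that is,

$$\frac{dn_1}{dt} = -k_1n_1, \qquad \frac{dn_2}{dt} = k_1n_1 - k_2n_2$$

Ans. $n_3 = (n_1 + n_2 + n_3)\left[1 - \frac{k_2}{k_2 - k_1}e^{-k_1t} + \frac{k_1}{k_2 - k_1}e^{-k_2t}\right]$

where $n_3$ = amount of C present at $t$.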


e. A projectile is fired vertically into the air with initial velocity $V$. (1) Find its
speed at any height; (2) find the time at which it will have traversed a distance $r$.
Note: the differential equation to be solved is

$$\frac{dv}{dt} = v\frac{dv}{dr} = -\frac{gR^2}{r^2}$$

where $g$ = acceleration due to gravity, $R$ = radius of the earth.

Ans. (1) $v^2 = V^2 - 2gR\left(1 - \frac{R}{r}\right)$

(2) If $V^2 > 2gR$,

$$t = \frac{1}{V^2 - 2gR}\left[r\left(V^2 - 2gR + \frac{2gR^2}{r}\right)^{1/2} - VR\right] - \frac{2gR^2}{(V^2 - 2gR)^{3/2}}\ln\left[\left(\frac{r}{R}\right)^{1/2}\frac{\left(V^2 - 2gR + \frac{2gR^2}{r}\right)^{1/2} + (V^2 - 2gR)^{1/2}}{V + (V^2 - 2gR)^{1/2}}\right]$$

2.3. The Differential Equation is, or Can be Made, Exact. Linear 
Equations. — A differential equation, written in the form 

$$A\,dx + B\,dy = 0 \tag{2-4}$$

where $A$ and $B$ are functions of $x$ and $y$, is said to be exact if the left-hand
side is an exact differential. The necessary and sufficient condition for this
to be true was shown in sec. 1.7 to be equivalent to the Cauchy relations

$$\frac{\partial A}{\partial y} = \frac{\partial B}{\partial x}$$

The equations considered in the foregoing section, where $A$ was a function
of $x$ alone and $B$ a function of $y$ alone, are exact in the trivial sense that
$\partial A/\partial y = \partial B/\partial x = 0$.

Differential equations occurring in practice are rarely exact, but every
equation of the form (4) can be made exact and then integrated. The
device for doing this is to multiply it by a suitable factor known as the
integrating factor. For instance, the equation



is not exact. It becomes exact on multiplication by xy. For it then takes 
the form 

which has the solution : 


xy 


— = const. 
3 


While an integrating factor exists for every equation of the form (4), it
is not always easy to find. If the equation is linear, however, that is if it
can be written

$$\frac{dy}{dx} + f(x)\,y = g(x) \tag{2-5}$$

the integrating factor is $e^{\int f\,dx}$. On multiplication by
this factor eq. (5) becomes

$$\frac{d}{dx}\left(y\,e^{F(x)}\right) = g\,e^{F(x)}$$

where the abbreviation $F(x) = \int f\,dx$ has been used. The solution is,
clearly,

$$y = e^{-F}\left[\int g\,e^{F}\,dx + c\right] \tag{2-6}$$

This result is most useful, for the occurrence of linear equations is very
frequent.


Examples. 

a. Circuit containing inductance and resistance (Helmholtz' equation).

This problem has already been discussed, but it may be instructive to solve
the differential equation also by the method of eq. (6). We have

$$\frac{dI}{dt} + \frac{R}{L}I = \frac{E}{L} \tag{2-7}$$

Thus

$$f = \frac{R}{L}, \qquad F = \frac{R}{L}t, \qquad g = \frac{E}{L}$$

so that

$$I = \frac{E}{R} + c\,e^{-(R/L)t}$$

and this agrees with our previous result (Problem a).


b. Circuit with inductance and resistance; variable electromotive force.

The present method permits the solution of eq. (7) when $E$ is a function of
the time, in which case the equation can no longer be separated. Let us
assume that

$$E = E_0\sin\omega t$$

so that

$$f = \frac{R}{L}; \qquad F = \frac{R}{L}t; \qquad g = \frac{E_0}{L}\sin\omega t$$

We then have

$$I = e^{-(R/L)t}\left[\frac{E_0}{L}\int e^{(R/L)t}\sin\omega t\,dt + c\right] = \frac{E_0}{L}\,\frac{1}{\omega'^2 + \omega^2}(\omega'\sin\omega t - \omega\cos\omega t) + c\,e^{-\omega't}$$

where $\omega'$ has been written for $R/L$, a quantity having the dimensions of a
frequency. To fix the constant we assume that $I(0) = 0$, in which case

$$I = \frac{E_0}{L}\,\frac{1}{\omega'^2 + \omega^2}\left(\omega'\sin\omega t - \omega\cos\omega t + \omega e^{-\omega't}\right)$$

The last term represents transient currents which disappear as soon as $t \gg 1/\omega'$.
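A short numerical sketch, added here for illustration and not part of the original text, compares the closed form for $I(t)$ with a direct Runge-Kutta integration of eq. (7) for a sinusoidal electromotive force. The function names and circuit values are assumptions; `w1` stands for the quantity written $\omega'$ above.

```python
import math

def current(t, E0, L, R, omega):
    """Closed-form I(t) for L dI/dt + R I = E0 sin(omega t) with I(0) = 0."""
    w1 = R / L                                   # omega' in the text
    amp = (E0 / L) / (w1 ** 2 + omega ** 2)
    return amp * (w1 * math.sin(omega * t) - omega * math.cos(omega * t)
                  + omega * math.exp(-w1 * t))   # last term is the transient

def rk4_current(t_end, E0, L, R, omega, steps=4000):
    """Integrate dI/dt = (E0 sin(omega t) - R I) / L numerically."""
    f = lambda t, i: (E0 * math.sin(omega * t) - R * i) / L
    h = t_end / steps
    t, i = 0.0, 0.0
    for _ in range(steps):
        k1 = f(t, i)
        k2 = f(t + h / 2, i + h * k1 / 2)
        k3 = f(t + h / 2, i + h * k2 / 2)
        k4 = f(t + h, i + h * k3)
        i += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return i

# Assumed circuit values: E0 = 10, L = 0.5, R = 2, omega = 3 (consistent units).
print(current(2.0, 10.0, 0.5, 2.0, 3.0), rk4_current(2.0, 10.0, 0.5, 2.0, 3.0))
```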


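* Here and elsewhere, there occurs the integral $\int e^{\omega't}\sin\omega t\,dt$. This is easily
evaluated if the sine is written as an exponential:

$$\sin x = \frac{e^{ix} - e^{-ix}}{2i}$$

Thus

$$\int e^{\omega't}\sin\omega t\,dt = \frac{e^{\omega't}}{2i}\left[\frac{e^{i\omega t}}{\omega' + i\omega} - \frac{e^{-i\omega t}}{\omega' - i\omega}\right] = \frac{e^{\omega't}}{2i}\,\frac{1}{\omega'^2 + \omega^2}\left[(\omega' - i\omega)e^{i\omega t} - (\omega' + i\omega)e^{-i\omega t}\right]$$

$$= \frac{e^{\omega't}}{\omega'^2 + \omega^2}(\omega'\sin\omega t - \omega\cos\omega t) = -\frac{e^{\omega't}}{(\omega'^2 + \omega^2)^{1/2}}\cos(\omega t + \beta), \qquad \beta = \tan^{-1}\frac{\omega'}{\omega}$$

Similarly:

$$\int e^{\omega't}\cos\omega t\,dt = \frac{e^{\omega't}}{\omega'^2 + \omega^2}(\omega'\cos\omega t + \omega\sin\omega t)$$

c. Radioactive decay of mother and daughter substances.

Let $A$ be the number of atoms of the mother substance (e.g., UI) and $B$
the number of atoms of the daughter substance (e.g., UX$_1$) at time $t$, $A_0$
being the original value of $A$ at $t = 0$. Let $\lambda_A$ and $\lambda_B$ be the decay constants
as defined in sec. 2.2a. The two substances satisfy the two differential
equations

$$\frac{dA}{dt} = -\lambda_AA; \qquad \frac{dB}{dt} = -\lambda_BB + \lambda_AA$$

When the solution of the first, $A = A_0e^{-\lambda_At}$, is substituted in the second
there results

$$\frac{dB}{dt} + \lambda_BB = \lambda_AA_0e^{-\lambda_At}$$

an equation which is linear in $B$ and can be solved by formula (6). The
solution is:

$$B = e^{-\lambda_Bt}\left[\lambda_AA_0\int e^{(\lambda_B - \lambda_A)t}\,dt + c\right] = \frac{\lambda_A}{\lambda_B - \lambda_A}A_0\left(e^{-\lambda_At} - e^{-\lambda_Bt}\right)$$

if we assume that $B(0) = 0$. Note that $B$ will reach a maximum at time

$$t = \frac{1}{\lambda_A - \lambda_B}\ln\frac{\lambda_A}{\lambda_B}$$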
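The daughter-substance formula and the time of its maximum can be examined with a few lines of Python. This sketch is an added illustration under assumed decay constants, not data from the text; the function names are hypothetical.

```python
import math

def daughter(t, A0, lam_a, lam_b):
    """B(t) for dB/dt = -lam_b B + lam_a A, with A = A0 exp(-lam_a t), B(0) = 0."""
    return lam_a / (lam_b - lam_a) * A0 * (math.exp(-lam_a * t) - math.exp(-lam_b * t))

def time_of_maximum(lam_a, lam_b):
    """Time at which dB/dt vanishes (requires lam_a != lam_b)."""
    return math.log(lam_a / lam_b) / (lam_a - lam_b)

# Assumed values: a slowly decaying mother and a faster decaying daughter.
A0, LAM_A, LAM_B = 100.0, 0.1, 0.5
t_max = time_of_maximum(LAM_A, LAM_B)
print(t_max, daughter(t_max, A0, LAM_A, LAM_B))
```

Evaluating `daughter` slightly before and after `t_max` should give smaller values, in line with the statement that $B$ passes through a maximum there.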


Problem. A circuit contains capacitance $C$, resistance $R$, and is subject to an
electromotive force $E$. Calculate the instantaneous value of the electric charge $q$ on the
condenser, noting that it satisfies the differential equation

$$R\frac{dq}{dt} + \frac{q}{C} = E$$

Ans. For $E = E_0\sin\omega t$,

$$q = \frac{E_0}{R}\,\frac{1}{\omega'^2 + \omega^2}\left(\omega'\sin\omega t - \omega\cos\omega t + \omega e^{-\omega't}\right), \qquad \omega' = \frac{1}{RC}$$


2.4. Equations Reducible to Linear Form. — Of some mathematical 
interest is an equation of the form 

$$\frac{dy}{dx} + f(x)\,y = g(x)\,y^n \tag{2-8}$$

because it can be made linear by the substitution $u = y^{1-n}$. This converts
(8) into

$$\frac{du}{dx} + (1 - n)fu = (1 - n)g$$

which can be solved by the method of the preceding section. Eq. (8) is
often called Bernoulli's equation.
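To make the substitution concrete, consider the assumed example $dy/dx + y = xy^2$ (that is, $f = 1$, $g = x$, $n = 2$), which does not appear in the original text. With $u = 1/y$ the equation becomes $du/dx - u = -x$, and formula (6) gives $u = x + 1 + ce^x$. The sketch below compares the resulting $y = 1/u$ with a direct numerical integration; the function names are hypothetical.

```python
import math

def exact(x, y0):
    """y(x) for dy/dx + y = x y**2 with y(0) = y0, obtained through u = 1/y."""
    c = 1.0 / y0 - 1.0                 # fixes u(0) = 1/y0 in u = x + 1 + c e**x
    return 1.0 / (x + 1.0 + c * math.exp(x))

def numeric(x_end, y0, steps=2000):
    """Direct RK4 integration of the nonlinear equation, for comparison."""
    f = lambda x, y: x * y * y - y
    h = x_end / steps
    x, y = 0.0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

print(exact(2.0, 0.5), numeric(2.0, 0.5))
```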





2.5. Homogeneous Differential Equations. — A first order equation is
said to be homogeneous* if, the equation being written in the form

$$A\,dx + B\,dy = 0$$

$A$ and $B$ are homogeneous functions of the same degree, i.e.,

$$A(tx, ty) = t^nA(x, y); \qquad B(tx, ty) = t^nB(x, y)$$

If this is true we can substitute $y = vx$, obtaining

$$A(x, y) = A(x, vx) = x^nA(1, v); \qquad B(x, y) = x^nB(1, v)$$

The original equation,

$$\frac{dy}{dx} = -\frac{A}{B}$$

is converted into

$$v + x\frac{dv}{dx} = -\frac{A(1, v)}{B(1, v)} \equiv f(v)$$

by this substitution, and this equation is separable, yielding

$$\frac{dv}{f(v) - v} = \frac{dx}{x}$$

* A remark on the use of the word "homogeneous" in mathematics seems in order,
for the term is used with several different meanings in different contexts. The following
definitions correspond to the chief usages.

1. Homogeneous function: $f(x_1, x_2, \cdots, x_n)$ is said to be homogeneous in all its variables
if, for any parameter $t$, $f(tx_1, tx_2, \cdots, tx_n) = t^af(x_1, x_2, \cdots, x_n)$. $a$ is the "degree" of
the homogeneous function.

2. Homogeneous equations: A set of simultaneous linear algebraic equations of the
form

$$\sum_{i=1}^{n} a_{ji}x_i = c_j, \qquad j = 1, 2, \cdots, n$$

in which the $a$'s are constants is said to be homogeneous if all $c$'s are zero.

3. Homogeneous differential equations: (Two usages of the term!)

a. A first order equation of the form $A\,dx + B\,dy = 0$ is said to be homogeneous if
$A(x, y)$ and $B(x, y)$ are homogeneous functions of the same degree.

b. In general, $F(x, y, y', y'', \cdots) = 0$ is said to be homogeneous if $F$ is a homogeneous
function of $y$ and all its derivatives, not necessarily of $x$. Thus

$$\frac{d^2y}{dx^2} + f(x)\frac{dy}{dx} + g(x)\,y = 0$$

is homogeneous and linear. If the right-hand side of this equation were not zero but
equal to a function of $x$, the equation would still be linear but no longer homogeneous.

Example. Lines of force.

An equation closely related to the homogeneous type, and tractable by the
substitution here described, is the differential equation for lines of force.
A line of force is defined as that curve which is tangent, at every point
through which it passes, to the force at that point. The present analysis
is applicable to attracting mass points, attracting or repelling electric
charges, and magnetic poles. Let it be desired, for example, to find the
lines of force due to two charges, $q_1$ and $q_2$, a distance $2a$ apart. (Cf.
Fig. 2.) If we restrict our consideration to the plane containing the charges
and the point P, then, for every point in this plane, the definition of a line
of force requires that


$$\frac{dy}{dx} = \frac{F_y}{F_x} = \frac{\dfrac{q_1}{r_1^3}(y + a) + \dfrac{q_2}{r_2^3}(y - a)}{\dfrac{q_1}{r_1^3}x + \dfrac{q_2}{r_2^3}x} \tag{2-9}$$

If $a$ were zero, this would reduce to $dy/dx = y/x$, an equation which has
for its solution all straight lines through the origin. These, as is well
known, represent the lines of force due to a point charge. In general,
however, eq. (9) reads

$$\frac{q_1}{r_1^3}\left[x\,dy - (y + a)\,dx\right] + \frac{q_2}{r_2^3}\left[x\,dy - (y - a)\,dx\right] = 0 \tag{2-9a}$$

This equation misses being homogeneous by the presence of the quantity $a$.
But a simple artifice will help. If we introduce two new dependent
variables, $y_1 = y + a$ and $y_2 = y - a$, so that $dy_1 = dy_2 = dy$;
$r_1 = (x^2 + y_1^2)^{1/2}$, $r_2 = (x^2 + y_2^2)^{1/2}$, eq. (9a) takes the form

$$q_1\frac{x\,dy_1 - y_1\,dx}{(x^2 + y_1^2)^{3/2}} + q_2\frac{x\,dy_2 - y_2\,dx}{(x^2 + y_2^2)^{3/2}} = 0$$

each part of which is homogeneous. Now put $y_1 = v_1x$, $y_2 = v_2x$ so that

$$x^2\,dv = x\,dy - y\,dx$$

The result is then simply

$$q_1\frac{dv_1}{(1 + v_1^2)^{3/2}} + q_2\frac{dv_2}{(1 + v_2^2)^{3/2}} = 0$$

When this is integrated, we immediately obtain the equation of the lines
of force due to the two charges:

$$q_1\frac{v_1}{(1 + v_1^2)^{1/2}} + q_2\frac{v_2}{(1 + v_2^2)^{1/2}} = \frac{q_1y_1}{r_1} + \frac{q_2y_2}{r_2} = \text{const.}$$
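The constancy of $q_1y_1/r_1 + q_2y_2/r_2$ along a line of force can be tested numerically. The sketch below, an added illustration with assumed charges and starting point, places $q_1$ at $(0, -a)$ and $q_2$ at $(0, +a)$ as implied by eq. (9), follows the field direction in small steps of arc length, and monitors the invariant. The function names are hypothetical, and the trace is confined to $x > 0$.

```python
import math

def field(x, y, q1, q2, a):
    """Field of q1 at (0, -a) and q2 at (0, +a), evaluated in the x-y plane."""
    r1 = math.hypot(x, y + a)
    r2 = math.hypot(x, y - a)
    fx = q1 * x / r1 ** 3 + q2 * x / r2 ** 3
    fy = q1 * (y + a) / r1 ** 3 + q2 * (y - a) / r2 ** 3
    return fx, fy

def invariant(x, y, q1, q2, a):
    """q1 y1 / r1 + q2 y2 / r2, which should stay constant on a line of force."""
    return (q1 * (y + a) / math.hypot(x, y + a)
            + q2 * (y - a) / math.hypot(x, y - a))

def trace(x, y, q1, q2, a, ds=1e-3, steps=2000):
    """Follow a line of force by RK4 steps of arc length ds."""
    def unit(px, py):
        fx, fy = field(px, py, q1, q2, a)
        norm = math.hypot(fx, fy)
        return fx / norm, fy / norm
    points = [(x, y)]
    for _ in range(steps):
        k1 = unit(x, y)
        k2 = unit(x + ds * k1[0] / 2, y + ds * k1[1] / 2)
        k3 = unit(x + ds * k2[0] / 2, y + ds * k2[1] / 2)
        k4 = unit(x + ds * k3[0], y + ds * k3[1])
        x += ds * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += ds * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        points.append((x, y))
    return points

# Assumed example: q1 = 1, q2 = 2, a = 1, starting from the point (0.5, 0.3).
path = trace(0.5, 0.3, 1.0, 2.0, 1.0)
values = [invariant(px, py, 1.0, 2.0, 1.0) for px, py in path]
print(min(values), max(values))
```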


2.6. Note on Singular Solutions. Clairaut's Equation. — A first order
equation of degree higher than the first may have a special kind of solution
which is not obtainable by specifying the constants in its general solution.
Thus consider

$$y = x\frac{dy}{dx} + \left(\frac{dy}{dx}\right)^2 \tag{2-10}$$

This equation may be solved by the following artifice. Differentiate once
more, thus converting it into a second order equation, which, however, can
easily be handled by the methods already discussed. The result is

$$\frac{dy}{dx} = \frac{dy}{dx} + x\frac{d^2y}{dx^2} + 2\frac{dy}{dx}\frac{d^2y}{dx^2}$$

or

$$\left(x + 2\frac{dy}{dx}\right)\frac{d^2y}{dx^2} = 0 \tag{2-11}$$

If now the first factor be cancelled, the equation is

$$\frac{d^2y}{dx^2} = 0$$

and has the solution $y = c_1x + c_2$. This, however, is too general a result
since it contains two constants of integration, a circumstance brought
about by the arbitrary procedure of converting the original first order
into a second order equation before solving. To satisfy eq. (10), it is
necessary to substitute this solution and adjust $c_2$ in conformity with its
demands. It is then seen that $c_2 = c_1^2$, and

$$y = cx + c^2$$

is the general solution of eq. (10).

But eq. (11) can also be satisfied by equating the first factor on the left
to zero. This leads to

$$x + 2\frac{dy}{dx} = 0, \quad\text{or}\quad y = -\frac{x^2}{4} + c$$

This will satisfy eq. (10) if $c = 0$. Thus

$$y = -\frac{x^2}{4}$$

is another solution of the original differential equation, but one which is not
derivable from its complete solution. It is called a singular solution.
Inspection will show that it represents the envelope of all the straight lines
which correspond to the complete solution. This is generally the meaning
of singular solutions.

An equation of the form

$$y = x\frac{dy}{dx} + f\left(\frac{dy}{dx}\right)$$

is known to mathematicians as Clairaut's equation. Eq. (10) is a specimen
of this type. Clairaut's equation can always be handled by the method
here used and has the general solution

$$y = cx + f(c)$$
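The envelope property of the singular solution of eq. (10) can be verified directly. In the sketch below, which is an added illustration with hypothetical function names, each line $y = cx + c^2$ is found to touch the parabola $y = -x^2/4$ at $x = -2c$, where both the ordinates and the slopes coincide.

```python
def residual(x, y, p):
    """Left side minus right side of y = x p + p**2 (eq. 2-10), with p = dy/dx."""
    return y - (x * p + p * p)

def line(c):
    """General solution y = c x + c**2, returned as a function x -> (y, dy/dx)."""
    return lambda x: (c * x + c * c, c)

def envelope(x):
    """Singular solution y = -x**2 / 4, returned as (y, dy/dx)."""
    return (-x * x / 4.0, -x / 2.0)

for c in (-2.0, -0.5, 1.0, 3.0):
    x_touch = -2.0 * c                      # abscissa of tangency for this line
    print(c, line(c)(x_touch), envelope(x_touch))
```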

EQUATIONS OF HIGHER ORDER

A general method for solving certain differential equations of higher 
order will be presented in secs. 2.10-12. It seems appropriate, however, to 
discuss first a few special types of differential equations which can be solved 
by elementary means. While the theory given in this section is applicable 
to equations of any order, emphasis will be placed solely on second order 
equations because of their prominence in mathematical physics. 

2.7. Linear Equations with Constant Coefficients; Right-Hand Mem- 
ber Zero. — In discussing this type of equation it becomes convenient to 
introduce a new notation; we write $D = d/dx$. A symbol such as $D$,
which is meaningless unless applied to a function of $x$, and which is therefore
not a mathematical quantity in the usual sense, bears the name
"operator." In the present connection $D$ may be regarded as nothing
more than an abbreviation. Later, however, when the mathematics of
quantum mechanics is to be studied, it will be found that operators such as
$D$ are entities of considerable significance which give rise to an operator
algebra quite different in many respects from ordinary algebra. For the
present we merely observe that a differential equation of the type under
discussion in its most general form may be written:

$$D^ny + a_1D^{n-1}y + a_2D^{n-2}y + \cdots + a_ny = 0 \tag{2-12}$$

The $a$'s are constants; the order of the equation is $n$. Consider now the
differential equation

$$(D - r_1)(D - r_2) \cdots (D - r_n)y = 0 \tag{2-13}$$

which must be understood to mean that the successive application of
$d/dx - r_n$, $d/dx - r_{n-1}$, etc., upon $y$ is to yield zero, the $r$'s being constants.
It is clear that (12) and (13) become identical when the $r$'s are chosen to be
the roots of the algebraic equation

$$r^n + a_1r^{n-1} + \cdots + a_n = 0 \tag{2-14}$$

Let us then attempt to solve (13). A particular solution of that equation is
easily found, for if $y$ satisfies

$$(D - r_n)y = 0$$

it will also satisfy (13), since further differentiations and multiplications by
$r$ will leave the right-hand side unchanged. But $(D - r_n)y = 0$ has the
solution $y = c_ne^{r_nx}$; hence this is a particular solution of (13).

Furthermore, we observe that the order of the "factors" $(D - r_i)$
appearing in (13) is insignificant. Hence any factor may be written last,
and this means that $c_ie^{r_ix}$ is also a particular solution, and so on. On
adding all particular solutions, i.e., on putting

$$y = \sum_i c_ie^{r_ix} \tag{2-15}$$

there results a solution with $n$ independent arbitrary constants, and this
must therefore be the complete solution. To summarize: in order to solve
(12), first determine the roots of (14), which is known as the auxiliary
equation. If these roots are denoted by $r_i$, the general solution is (15).

One point is to be noted. If the coefficients $a$ appearing in (12) are
functions of $x$, the decomposition into factors leading to (13) cannot be
made by solving the auxiliary equation. The reason is that then the $r$'s
will also be functions of $x$, and

$$(D - r_1)(D - r_2)y \neq (D - r_2)(D - r_1)y$$

as the reader may easily verify. This state of affairs is expressed succinctly
by saying that the operators $(D - r_1)$ and $(D - r_2)$ are commutative only
if the $r$'s are constants. For variable $r$'s the order of the factors in (13)
is also essential, so that the whole method of solution here discussed must
fail.

Returning to the case of constant coefficients, one minor difficulty must
be considered. Suppose that two roots of the auxiliary equation are equal.
If they are called $r_1$ the supposedly general solution will contain the part
$(c_1 + c_2)e^{r_1x}$ which is equivalent to $ce^{r_1x}$. One arbitrary constant has been
lost and the solution obtained is no longer complete. To remove this
fault we consider the two factors of (13) which gave rise to it and study the
equation

$$(D - r_1)^2y = 0 \tag{2-16}$$

One solution is certainly $y = e^{r_1x}$. Let us look for a general solution of the
form $y = f(x)e^{r_1x}$. On substitution of this into (16) there results the
following differential equation for $f(x)$:

$$\frac{d^2f}{dx^2} = 0$$

Hence $f = c_1x + c_2$, and the complete solution of (16) reads

$$y = (c_1x + c_2)e^{r_1x}$$

This shows that, when two roots of the auxiliary equation are equal and
have the value $r_1$, the part of the solution $(c_1 + c_2)e^{r_1x}$ occurring in (15)
must be replaced by $(c_1x + c_2)e^{r_1x}$. An extension of this argument leads
to the general result: If $r_i$ is a $k$-fold root of the auxiliary equation, the
complete solution of (12) is

$$y = c_1e^{r_1x} + c_2e^{r_2x} + \cdots + c_i\left(1 + a_1x + a_2x^2 + \cdots + a_{k-1}x^{k-1}\right)e^{r_ix} + \cdots$$
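For the second order case the recipe can be sketched in Python. The code below is an added illustration with hypothetical names, not the authors' procedure: it finds the roots of the auxiliary equation $r^2 + a_1r + a_2 = 0$, builds the solution (including the modified form for a double root), and estimates by central differences how well $y'' + a_1y' + a_2y = 0$ is satisfied. Complex arithmetic is used so that imaginary roots need no separate treatment.

```python
import cmath

def solution(a1, a2, c1, c2):
    """Return y(x) solving y'' + a1 y' + a2 y = 0 via the auxiliary equation."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    r1, r2 = (-a1 + disc) / 2.0, (-a1 - disc) / 2.0
    if abs(r1 - r2) < 1e-12:
        # Double root: (c1 x + c2) e^{r1 x} replaces (c1 + c2) e^{r1 x}.
        return lambda x: (c1 * x + c2) * cmath.exp(r1 * x)
    return lambda x: c1 * cmath.exp(r1 * x) + c2 * cmath.exp(r2 * x)

def residual(y, a1, a2, x, h=1e-4):
    """Central-difference estimate of y'' + a1 y' + a2 y at the point x."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return d2 + a1 * d1 + a2 * y(x)

# Assumed test cases: distinct real roots, a double root, imaginary roots.
for a1, a2 in ((3.0, 2.0), (-4.0, 4.0), (0.0, 9.0)):
    print(a1, a2, abs(residual(solution(a1, a2, 1.5, -0.7), a1, a2, 0.5)))
```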


Examples. 

a. Simple harmonic motion. 

When the force on a particle of mass $m$ moving along the $y$-axis is equal to
$-ky$, Newton's second law of motion reads:

$$m\frac{d^2y}{dt^2} = -ky$$

Here $k$, the force per unit of displacement of the particle, is known as the
stiffness of the oscillator. If we denote the positive constant $k/m$ by $\omega^2$,
the equation becomes $d^2y/dt^2 + \omega^2y = 0$. The roots of the auxiliary
equation $r^2 + \omega^2 = 0$ are $r_1 = i\omega$, $r_2 = -i\omega$. Hence by (15)

$$y = c_1e^{i\omega t} + c_2e^{-i\omega t}$$

The constants $c_1$ and $c_2$ may of course be complex. This result may be
written in two other, but equivalent, forms. On expanding the exponentials
in sines and cosines we obtain

$$y = (c_1 + c_2)\cos\omega t + (c_1 - c_2)i\sin\omega t = C_1\cos\omega t + C_2\sin\omega t$$

This last result may also be stated as follows:

$$y = A\sin(\omega t + \delta) = A'\cos(\omega t + \delta')$$

where the new constants $A$, $\delta$, and $A'$, $\delta'$ are related to $C_1$ and $C_2$
by $A\sin\delta = C_1$, $A\cos\delta = C_2$; $A'\cos\delta' = C_1$, $-A'\sin\delta' = C_2$, or conversely
$A^2 = A'^2 = C_1^2 + C_2^2$, $\delta = \tan^{-1}C_1/C_2$; $\delta' = -\tan^{-1}C_2/C_1$.


b. Chain sliding over a smooth peg. 

The chain (cf. Fig. 3) is sliding over the peg, the
right end moving downward. Let the displacement
of this end from $O$, the point it would occupy in equilibrium,
be $y$. If the linear density of the chain is $\lambda$,
and its total length $l$, the mass to be accelerated
is $l\lambda$. The resultant force is $2\lambda yg$. Hence, from
Newton's second law,

$$l\lambda\frac{d^2y}{dt^2} = 2\lambda gy, \quad\text{or}\quad \frac{d^2y}{dt^2} - \frac{2g}{l}y = 0$$

The auxiliary equation has the roots $\pm\sqrt{2g/l}$, leading
to the general solution $y = c_1e^{\sqrt{2g/l}\,t} + c_2e^{-\sqrt{2g/l}\,t}$.
The constants may be fixed by supposing that, when $t = 0$, $y = y_0$ and
$dy/dt = 0$. Then $c_1 + c_2 = y_0$; $c_1 - c_2 = 0$; and

$$y = y_0\,\frac{e^{\sqrt{2g/l}\,t} + e^{-\sqrt{2g/l}\,t}}{2} = y_0\cosh\sqrt{\frac{2g}{l}}\,t$$



c. Damped simple harmonic motion. 

When the motion of the oscillator considered in example (a) is damped,
there is present, besides the restoring force $-ky$, a damping force proportional
(at small velocities) to $-l(dy/dt)$, the negative sign indicating that
the force retards the motion; $l$ is known as the damping constant. The
differential equation describing the motion is

$$\frac{d^2y}{dt^2} + 2b\frac{dy}{dt} + \omega^2y = 0 \tag{2-17}$$

if $b$ is written for the constant quantity $l/2m$. The auxiliary equation has
the roots $-b \pm \sqrt{b^2 - \omega^2}$, so that the general solution becomes

$$y = e^{-bt}\left(c_1e^{\sqrt{b^2 - \omega^2}\,t} + c_2e^{-\sqrt{b^2 - \omega^2}\,t}\right)$$

To adjust the constants in conformity with physical conditions we suppose
that, at $t = 0$, $y = y_0$ and $dy/dt = 0$. Then with the use of the abbreviation
$R = \sqrt{b^2 - \omega^2}$

$$y = \frac{y_0}{2}e^{-bt}\left[\left(1 + \frac{b}{R}\right)e^{Rt} + \left(1 - \frac{b}{R}\right)e^{-Rt}\right] \tag{2-18}$$

Several special cases are of interest in this connection.

($\alpha$) $b > \omega$. $R$ is then real, but smaller than $b$. Hence both terms of
(18) represent an exponential decrease. The motion is not oscillatory.

($\beta$) $b = \omega$. Then $R = 0$, and $y = y_0e^{-bt}(1 + bt)$. The motion is not oscillatory;
it is said to be critically damped.

($\gamma$) $b < \omega$. Then $R$ is imaginary and may be written $R = i\omega'$,
$\omega'^2 = \omega^2 - b^2$. Eq. (18) now reads

$$y = y_0e^{-bt}\left(\cos\omega't + \frac{b}{\omega'}\sin\omega't\right)$$

or, in equivalent form,

$$y = y_0\frac{\omega}{\omega'}e^{-bt}\sin(\omega't + \delta)$$

where $\delta = \tan^{-1}\omega'/b$. This represents a damped sinusoidal motion of
period $T = 2\pi/\sqrt{\omega^2 - b^2}$; the amplitude decreases exponentially as $e^{-bt}$.
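The three damping regimes can be covered by one routine if $R$ is allowed to be complex. The following sketch is an added illustration, with hypothetical function names and assumed parameter values: it evaluates eq. (18) in that way and compares it with a Runge-Kutta integration of eq. (17) started from $y = y_0$, $dy/dt = 0$.

```python
import cmath
import math

def damped(t, y0, b, omega):
    """Eq. (2-18) for y(0) = y0, y'(0) = 0, valid in all three regimes."""
    if math.isclose(b, omega):
        return y0 * math.exp(-b * t) * (1.0 + b * t)        # critical damping
    R = cmath.sqrt(b * b - omega * omega)                    # imaginary if b < omega
    y = 0.5 * y0 * cmath.exp(-b * t) * ((1 + b / R) * cmath.exp(R * t)
                                        + (1 - b / R) * cmath.exp(-R * t))
    return y.real

def rk4_oscillator(t_end, y0, b, omega, steps=4000):
    """Integrate y'' + 2 b y' + omega**2 y = 0 as a first-order system."""
    def f(y, v):
        return v, -2.0 * b * v - omega * omega * y
    h = t_end / steps
    y, v = y0, 0.0
    for _ in range(steps):
        k1 = f(y, v)
        k2 = f(y + h * k1[0] / 2, v + h * k1[1] / 2)
        k3 = f(y + h * k2[0] / 2, v + h * k2[1] / 2)
        k4 = f(y + h * k3[0], v + h * k3[1])
        y += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        v += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return y

# Assumed cases: overdamped, critically damped, underdamped (omega = 2).
for b in (3.0, 2.0, 0.5):
    print(b, damped(1.5, 1.0, b, 2.0), rk4_oscillator(1.5, 1.0, b, 2.0))
```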


d. Natural oscillations in an electrical circuit. 

In a circuit containing $R$, $L$, and $C$, the sum of the partial electromotive
forces due to inductance, resistance and capacitance equals the external
e.m.f. If the latter is zero (natural oscillations) we have

$$L\frac{dI}{dt} + RI + \frac{q}{C} = 0$$

or, remembering that $I = dq/dt$,

$$\frac{d^2q}{dt^2} + \frac{R}{L}\frac{dq}{dt} + \frac{1}{LC}q = 0$$

This equation is of the form (17); the constants are $b = R/2L$,
$\omega = (LC)^{-1/2}$. The solutions are already given in the foregoing example.
In particular, if oscillations are to take place, $\omega > b$, i.e., $2\sqrt{L/C} > R$.
In that case

$$q = q_0\frac{\omega}{\omega'}e^{-(R/2L)t}\sin(\omega't + \delta), \qquad \omega' = \left(\frac{1}{LC} - \frac{R^2}{4L^2}\right)^{1/2}$$

and

$$\delta = \tan^{-1}\left(\frac{4L}{R^2C} - 1\right)^{1/2}$$

The initial conditions here are that at $t = 0$, the condenser has a charge
$q_0$ and there is no current.

2.8. Linear Equations with Constant Coefficients ; Right-Hand Mem- 
ber a Function of x. — We now restrict our considerations to differential 
equations of the second order. In terms of the notation of the foregoing 
section, the problem is to solve 

$$(D^2 + a_1D + a_2)y = f(x) \tag{2-19}$$

If the roots of the auxiliary equation are $r_1$ and $r_2$, this equation takes the
form

$$(D - r_1)(D - r_2)y = f(x) \tag{2-20}$$

Put $(D - r_2)y = u$, so that $(D - r_1)u = f(x)$. This is a linear first order
equation which can be solved by the method of sec. 3. It gives

$$u = e^{r_1x}\int e^{-r_1x}f(x)\,dx + c_1e^{r_1x} = e^{r_1x}\left(\varphi(x) + c_1\right)$$

where $\varphi(x) = \int e^{-r_1x}f(x)\,dx$. If this is substituted back into the
definition of $u$, the result is $(D - r_2)y = e^{r_1x}(\varphi(x) + c_1)$, an equation
which may again be treated in accordance with formula (6). Hence

$$y = e^{r_2x}\int e^{(r_1 - r_2)x}\left[\varphi(x) + c_1\right]dx + c_2e^{r_2x} = e^{r_2x}\int e^{(r_1 - r_2)x}\varphi(x)\,dx + \frac{c_1}{r_1 - r_2}e^{r_1x} + c_2e^{r_2x}$$

On changing the meaning of the constant $c_1$, we write the solution of (19)

$$y = e^{r_2x}\int e^{(r_1 - r_2)x}\varphi(x)\,dx + c_1e^{r_1x} + c_2e^{r_2x} \tag{2-21}$$


The form of this solution is interesting. The last two terms are identical
with the solution of the homogeneous equation. They are called the
complementary function, while the remainder, $e^{r_2x}\int e^{(r_1 - r_2)x}\varphi(x)\,dx$, is
known as the particular integral. Thus the "inhomogeneity" of the
equation, $f(x)$, makes its appearance in the particular integral only. It is
sometimes possible to find the particular integral of an equation like (19)
by inspection, that is, by selecting any function which will satisfy the equation.
When this is available one can make use of the fact just noted and
form the complete solution by adding to this function the general solution
of the homogeneous equation. Usually, however, the straightforward calculation
of the particular integral is hardly more difficult.

The particular integral can be written in a form which is often more
convenient in practice. On performing a partial integration we find

$$\int e^{(r_1 - r_2)x}\varphi\,dx = \frac{e^{(r_1 - r_2)x}}{r_1 - r_2}\varphi - \int\frac{e^{(r_1 - r_2)x}}{r_1 - r_2}\frac{d\varphi}{dx}\,dx = \frac{e^{(r_1 - r_2)x}}{r_1 - r_2}\int e^{-r_1x}f(x)\,dx - \int\frac{e^{-r_2x}}{r_1 - r_2}f(x)\,dx$$

because $d\varphi/dx = e^{-r_1x}f(x)$. The particular integral then becomes

$$\frac{1}{r_1 - r_2}\left[e^{r_1x}\int e^{-r_1x}f(x)\,dx - e^{r_2x}\int e^{-r_2x}f(x)\,dx\right]$$

and finally

$$y = \frac{1}{r_1 - r_2}\left[e^{r_1x}\int e^{-r_1x}f(x)\,dx - e^{r_2x}\int e^{-r_2x}f(x)\,dx\right] + c_1e^{r_1x} + c_2e^{r_2x} \tag{2-22}$$


Examples. 

a. Forced oscillations of a mechanical or electrical system. 

The equation to be considered is (17) but with a function of $t$ instead of
zero on the right. In most applications this function, which represents the
impressed force divided by the mass of the oscillating system in the mechanical
case, is a sinusoidal function of the time. Hence we are dealing with
the differential equation

$$\frac{d^2y}{dt^2} + 2b\frac{dy}{dt} + \omega^2y = f_0\sin\alpha t \tag{2-23}$$

As in sec. 7, example (c), the auxiliary equation has the roots

$$r_1 = -b + \sqrt{b^2 - \omega^2}, \qquad r_2 = -b - \sqrt{b^2 - \omega^2}$$

If again we denote $\sqrt{b^2 - \omega^2}$ by $R$, the particular integral is

$$\text{P.I.} = \frac{f_0}{2R}\left[e^{(-b + R)t}\int e^{(b - R)t}\sin\alpha t\,dt - e^{(-b - R)t}\int e^{(b + R)t}\sin\alpha t\,dt\right]$$

The integrals here may be evaluated by means of the formulas on p. 43.
When this is done and the terms are suitably collected,

$$\text{P.I.} = \frac{f_0}{2R}\left\{\left[\frac{b - R}{(b - R)^2 + \alpha^2} - \frac{b + R}{(b + R)^2 + \alpha^2}\right]\sin\alpha t - \left[\frac{\alpha}{(b - R)^2 + \alpha^2} - \frac{\alpha}{(b + R)^2 + \alpha^2}\right]\cos\alpha t\right\}$$

$$= \frac{f_0}{(\omega^2 - \alpha^2)^2 + 4\alpha^2b^2}\left\{(\omega^2 - \alpha^2)\sin\alpha t - 2b\alpha\cos\alpha t\right\}$$

To obtain the complete solution we must add to this the solution of (17).
Hence

$$y = \frac{f_0}{(\omega^2 - \alpha^2)^2 + 4\alpha^2b^2}\left\{(\omega^2 - \alpha^2)\sin\alpha t - 2b\alpha\cos\alpha t\right\} + e^{-bt}\left(c_1e^{Rt} + c_2e^{-Rt}\right) \tag{2-24}$$

It is seen that the complementary function decays exponentially with $t$
and will be damped out eventually. It is therefore of little interest in
physical applications. The amplitude of the oscillations,

$$\frac{f_0}{\left[(\omega^2 - \alpha^2)^2 + 4\alpha^2b^2\right]^{1/2}}$$

has a maximum when the impressed (angular) frequency has the value

$$\alpha = (\omega^2 - 2b^2)^{1/2}$$

This is said to be the condition of resonance between the impressed force
and the vibrating system. If $b$ is zero there occurs what is sometimes
referred to as the "resonance catastrophe," for in that case the amplitude
is infinite when $\alpha = \omega$.
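The steady-state part of (24) and the resonance condition may be checked with the short sketch below. It is an added illustration under assumed values of $f_0$, $b$ and $\omega$, with hypothetical function names: the particular integral is substituted into eq. (23) by finite differences, and the amplitude at $\alpha = (\omega^2 - 2b^2)^{1/2}$ is compared with its neighbours.

```python
import math

def amplitude(alpha, f0, b, omega):
    """Steady-state amplitude of y'' + 2 b y' + omega**2 y = f0 sin(alpha t)."""
    return f0 / math.sqrt((omega ** 2 - alpha ** 2) ** 2 + 4 * alpha ** 2 * b ** 2)

def particular(t, alpha, f0, b, omega):
    """Particular integral appearing in eq. (2-24)."""
    d = (omega ** 2 - alpha ** 2) ** 2 + 4 * alpha ** 2 * b ** 2
    return f0 / d * ((omega ** 2 - alpha ** 2) * math.sin(alpha * t)
                     - 2 * b * alpha * math.cos(alpha * t))

def resonance_frequency(b, omega):
    """Impressed frequency of maximum amplitude (needs omega**2 > 2 b**2)."""
    return math.sqrt(omega ** 2 - 2 * b ** 2)

# Assumed values: f0 = 1, b = 0.3, omega = 2 in consistent units.
F0, B, OMEGA = 1.0, 0.3, 2.0
alpha_res = resonance_frequency(B, OMEGA)
print(alpha_res, amplitude(alpha_res, F0, B, OMEGA), amplitude(OMEGA, F0, B, OMEGA))
```

With light damping the resonance frequency lies only slightly below $\omega$, so the two printed amplitudes should be close, the first being the larger.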

($\alpha$) Mechanical system.

The present theory can be applied, for instance, to a mass $m$ held in equilibrium
by a spring of stiffness $k$ and damping constant $l$. We then have, as
in sec. 7c,

$$b = \frac{l}{2m}, \qquad \omega = \sqrt{\frac{k}{m}}$$

Resonance occurs when

$$\alpha = \left(\frac{k}{m} - \frac{l^2}{2m^2}\right)^{1/2}$$

($\beta$) Electrical system.

For an electrical system with an impressed electromotive force $E_0\sin\alpha t$ we
have (cf. sec. 7d), $b = R/2L$, $\omega = (LC)^{-1/2}$, $f_0 = E_0/L$. Resonance
occurs when

$$\alpha = \left(\frac{1}{LC} - \frac{R^2}{2L^2}\right)^{1/2}$$

The solution (24) represents the charge, $q$, residing on the condenser at any
instant. The current $I$ is obtained by differentiating $q$ with respect to the
time. Both terms in braces then become positive, and

$$I = A\left[(\omega^2 - \alpha^2)\cos\alpha t + 2b\alpha\sin\alpha t\right]$$

where $A$ stands for $\dfrac{E_0\alpha}{L\left[(\omega^2 - \alpha^2)^2 + 4\alpha^2b^2\right]}$. The power expended
in the circuit is $\int_0^T EI\,dt$. This integral contains two terms, one with the
integrand $\sin\alpha t\cos\alpha t$, the other with the integrand $\sin^2\alpha t$. The first of
these is 0 provided $T$ is taken large enough to include a great number of
cycles $2\pi/\alpha$, the last gives $\int_0^T\sin^2\alpha t\,dt = T/2$. Hence the power expended
is

$$E_0Ab\alpha T$$

The part of the current proportional to $\cos\alpha t$ causes no power consumption;
it is a wattless current which is always out of phase with the impressed
electromotive force.


b. Electrical polarization. 

An equation like (23) also describes the response of ordinary matter to an
impinging electromagnetic wave. A light wave, for instance, which is
polarized in such a way that its electric vector is along $y$, when incident
upon an electron inside a refracting medium, will exert a force equal to
$eE_0\sin\alpha t$ upon this electron. Here $E_0$ is the amplitude of the electric
vector of the light wave, $e$ the charge on an electron, $\alpha$ the frequency of the
light (assumed monochromatic). $f_0$ in (23) is then $(e/m)E_0$, $m$ being the
electron mass. The solution is given by (24). $y$ represents the displacement
of the electron under consideration at the time $t$. This gives rise to a
dipole of moment $ey$. By polarization is meant the dipole moment per
unit volume of the material, and this is obtained on multiplying the dipole
moment due to one electron by the number of displaceable electrons per
unit volume. If this number is $N$, then the polarization

$$P = \frac{Ne^2E_0}{m}\,\frac{(\omega^2 - \alpha^2)\sin\alpha t - 2b\alpha\cos\alpha t}{(\omega^2 - \alpha^2)^2 + 4\alpha^2b^2}$$

Further considerations of a physical nature* show how the index of refraction
and the conductivity of the substance may be deduced very easily
from this expression for $P$.

* See, for instance, Page, L., "Introduction to Theoretical Physics," D. Van
Nostrand Co., p. 532 et seq.

2.9. Other Special Forms of Second Order Differential Equations. — 

a. An equation of the type

$$\frac{d^2y}{dx^2} = f(x) \tag{2-25}$$

can be integrated by the method of sec. 8. If this is done, only formula
(21) is applicable, for the second formula (22) involves the quantity
$r_1 - r_2$, which is zero, the auxiliary equation corresponding to (25) having
equal roots: $r_1 = r_2 = 0$. The solution is

$$y = \int\varphi(x)\,dx + c_1 + c_2x = \int dx\int f(x)\,dx + c_1 + c_2x$$

This procedure is here very artificial, of course, for this result could have
been obtained directly by integrating (25) twice.


Example. Suspension bridge. 

Consider the part of the cable between A and the variable point P. It is in 
equilibrium under the action of three forces: the horizontal force, H, the 
tension, T, at P, and the weight W of, or supported by, AP, which of course 
need not act at the middle of the segment. Hence we have 

$$T \sin\theta = W; \qquad T \cos\theta = H$$

$$\therefore\ \tan\theta = \frac{dy}{dx} = \frac{W}{H}$$

This relation is true for every point P, provided W is the load between A 
and P. It is generally more convenient to write the equation in terms of 
$w = dW/dx$, i.e., the load per unit horizontal distance; $w = w(x)$: 

$$\frac{d^2y}{dx^2} = \frac{w(x)}{H} \tag{2-26}$$

In the case of the suspension bridge, the load is uniform along x, hence 
w = const. 

Solution: 

$$y = \frac{w}{2H}x^2 + c_1 x + c_2, \quad \text{a parabola}$$

b. Equations not containing y. 
If the equation to be solved is 

$$\frac{d^2y}{dx^2} = f\left(\frac{dy}{dx}, x\right)$$

introduce the new variable p = dy/dx. The resulting equation 

$$\frac{dp}{dx} = f(p, x)$$

can then be solved by one of the methods already discussed. 


Example. Cable hanging under its own weight. 

The equation describing the cable is (26), but w is not constant. In this 
case it is dW/ds, the weight per unit length of cable, which is constant, 
provided the latter is uniform. Put $dW/ds = \lambda$. Then 

$$\frac{d^2y}{dx^2} = \frac{\lambda}{H}\frac{ds}{dx} = \frac{\lambda}{H}\sqrt{1 + \left(\frac{dy}{dx}\right)^2}$$

From this $dp/\sqrt{1 + p^2} = (\lambda/H)\,dx$, so that 

$$\sinh^{-1} p = \frac{\lambda}{H}x + c_1$$

If the origin is chosen at the lowest point of the cable, $c_1 = 0$, and 

$$\frac{dy}{dx} = \sinh\frac{\lambda}{H}x; \qquad y = \frac{H}{\lambda}\cosh\frac{\lambda}{H}x + c = \frac{H}{\lambda}\left(\cosh\frac{\lambda}{H}x - 1\right)$$

This curve is known as a catenary. 
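
The catenary result can be checked numerically. The following is only a sketch under stated assumptions: the function names `catenary` and `residual`, the step size `h`, and the sample values of $\lambda$ and H are illustrative choices, not part of the text. It evaluates the left and right sides of the cable equation with central finite differences and returns their difference, which should be close to zero.

```python
import math

def catenary(x, lam, H):
    """y(x) = (H/lam) * (cosh(lam*x/H) - 1), origin at the lowest point."""
    return (H / lam) * (math.cosh(lam * x / H) - 1.0)

def residual(x, lam, H, h=1e-4):
    """y'' - (lam/H) * sqrt(1 + y'^2), using central finite differences."""
    y0 = catenary(x, lam, H)
    yp = catenary(x + h, lam, H)
    ym = catenary(x - h, lam, H)
    d1 = (yp - ym) / (2 * h)            # approximates dy/dx
    d2 = (yp - 2 * y0 + ym) / h**2      # approximates d2y/dx2
    return d2 - (lam / H) * math.sqrt(1.0 + d1**2)

# Hypothetical sample values: weight per unit length 2, horizontal force 5.
print(residual(0.7, lam=2.0, H=5.0))
```

The residual is limited by the finite-difference error, so it is small rather than exactly zero.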
c. Equations not containing x. 



Again we put dy/dx = p, but now we write 

$$\frac{d^2y}{dx^2} = \frac{dp}{dx} = \frac{dp}{dy}\frac{dy}{dx} = p\frac{dp}{dy}$$

2.10. Qualitative Considerations Regarding Eq. 27. — Before turning to 
the consideration of exact solutions of (27), a few remarks concerning their 
qualitative behavior in limited domains of x may be of value. To survey 
their behavior, it is often advisable to remove the first derivative occurring 
in (27), which is always possible by means of a simple transformation of the 
dependent variable. Instead of y, we introduce v, related to y by 

$$y = v\,e^{-\frac{1}{2}\int X_1\,dx}$$

When this is substituted into (27) and the exponential factor is then 
cancelled, there results an equation for v: 

$$v'' + \left(X_2 - \tfrac{1}{2}X_1' - \tfrac{1}{4}X_1^2\right)v = 0$$

from which the first derivative is absent. This represents essentially a 
relation between v and the curvature of v and may be written 

$$v'' = f(x)\,v$$



One fact is at once apparent: provided v is finite, it has a point of 
inflexion wherever f(x) = 0. Furthermore, in regions where f(x) > 0 
two facts are to be noted: If v is positive and has a positive slope, the slope 
will continually increase as x increases, causing v to grow rapidly; if v is 
positive and has a negative slope, the positive v'' will continually diminish 
its steepness, causing v to approach the x-axis and then in general to turn 
upwards again. For negative v the words "positive" and "negative" 
in the preceding sentence should be interchanged. This qualitative 
behavior is most easily remembered if we think of the special case in which 
$f(x) = \text{const.} = k^2 > 0$. The solution is then 

$$v = c_1 e^{kx} + c_2 e^{-kx}$$

which typifies the foregoing remarks. 

If, however, we consider a region in which f(x) < 0, the slope of positive 
v will be continually diminished. Thus if v starts out with positive slope 
this will soon be zero and then decrease until v = 0; as v then becomes 
negative its negative slope will increase until it is horizontal and v turns 
back toward zero. In short, v is oscillatory. This again is easily remembered 
if we consider the special case in which $f(x) = -\omega^2 < 0$, for it has the 
solution $v = c \sin(\omega x + \delta)$. 

Fig. 5 illustrates these facts. To the left of A, v oscillates; at A it has a 
point of inflexion; to the right of A it is of exponential behavior. 
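
The qualitative picture can be reproduced with a short numerical experiment. This is a rough sketch, not the construction behind Fig. 5: the sample choice f(x) = x (negative on the left of the origin, positive on the right), the Runge-Kutta integrator, the initial values, and all function names are assumptions made only for illustration. Counting sign changes of v shows oscillation where f < 0 and at most one crossing of the axis where f > 0.

```python
def integrate(f, x0, x1, v0, dv0, n=20000):
    """Integrate v'' = f(x) * v with the classical fourth-order Runge-Kutta rule.

    Returns a list of (x, v, dv) samples from x0 to x1.
    """
    h = (x1 - x0) / n
    x, v, p = x0, v0, dv0
    samples = [(x, v, p)]
    for _ in range(n):
        k1v, k1p = p, f(x) * v
        k2v, k2p = p + 0.5 * h * k1p, f(x + 0.5 * h) * (v + 0.5 * h * k1v)
        k3v, k3p = p + 0.5 * h * k2p, f(x + 0.5 * h) * (v + 0.5 * h * k2v)
        k4v, k4p = p + h * k3p, f(x + h) * (v + h * k3v)
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        x += h
        samples.append((x, v, p))
    return samples

def sign_changes(samples):
    """Count sign changes of v along the sampled solution."""
    vals = [s[1] for s in samples]
    return sum(1 for a, b in zip(vals, vals[1:]) if a * b < 0)

f = lambda x: x                          # assumed sample f: inflexion point at x = 0
left = integrate(f, -10.0, 0.0, 1.0, 0.0)
_, v_a, dv_a = left[-1]                  # continue from the inflexion point
right = integrate(f, 0.0, 3.0, v_a, dv_a)
print(sign_changes(left), sign_changes(right))
```

The first count is the number of oscillation zeros in the region f < 0; the second can never exceed one, since a solution that is convex wherever it is positive cannot return to the axis twice.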

2.11. Example of Integration in Series. Legendre's Equation. — 
To illustrate the method of series integration, let us postpone fundamental 
matters and start by studying a specific example. An equation of considerable 
interest is Legendre's; it has the form 

$$(1 - x^2)y'' - 2xy' + l(l + 1)y = 0 \tag{2-28}$$

in which l is a constant. We attempt to find a solution which is a series in 
positive powers of x. If the lowest power occurring is $\kappa$, this solution will 
have the general form 

$$y = \sum_{\lambda=0}^{\infty} a_\lambda x^{\kappa+\lambda} \tag{2-29}$$

Solving the differential equation then amounts to determining the coefficients 
$a_\lambda$. Whether the series converges can be tested after this has been 
achieved. At present it will be assumed that this is the case, and that (29) 
may be differentiated term by term. When (29) is substituted in (28) 
the result is 

$$\sum_\lambda a_\lambda(\kappa + \lambda)(\kappa + \lambda - 1)x^{\kappa+\lambda-2} - \sum_\lambda a_\lambda[(\kappa + \lambda)(\kappa + \lambda - 1) + 2(\kappa + \lambda) - l(l + 1)]x^{\kappa+\lambda} = 0 \tag{2-30}$$

This equation must hold for every value of x, and this can be true only if 
the coefficient of every power of x is identically zero. Since $\lambda$ cannot, by 
hypothesis, be negative, the lowest power of x occurring in (30) is $x^{\kappa-2}$, 
and it is present only in the first summation of (30). Thus we find, putting 
$\lambda = 0$ to obtain the term in question, 

$$a_0\kappa(\kappa - 1) = 0 \tag{2-31}$$

$a_0$ is the lowest coefficient in our summation and hence not zero. Equation 
(31) therefore determines $\kappa$. It is often called the indicial equation. 
Clearly, two values of $\kappa$ are permissible: 

$$\kappa = 0, 1$$

Next, we see what further information eq. (30) will give. According to 
the foregoing, the coefficient of $x^{\kappa+j}$ must vanish for every positive integer j. 
Now the term corresponding to the $(\kappa + j)$-th power of x is obtained in the 
first summation by putting $\lambda = j + 2$, in the second by putting $\lambda = j$. 
Hence 

$$a_{j+2}(\kappa + j + 2)(\kappa + j + 1) = a_j[(\kappa + j)(\kappa + j + 1) - l(l + 1)]$$

$$a_{j+2} = a_j\frac{(\kappa + j)(\kappa + j + 1) - l(l + 1)}{(\kappa + j + 1)(\kappa + j + 2)} \tag{2-32}$$

Thus, if $a_j$ is given, $a_{j+2}$ can be computed from this relation. Starting 
with $a_0$, (32) permits us to obtain, successively, $a_2$, $a_4$, etc.; $a_0$, however, is 
arbitrary; it is one of the two arbitrary constants appearing in the general 
solution of a second order differential equation. On the other hand, if $a_1$ 
is assigned arbitrarily, all coefficients with odd subscripts are deducible from 
(32). 

Choice 1. Let us take $\kappa = 0$. Eq. (32) then reads 

$$a_{j+2} = a_j\frac{j(j + 1) - l(l + 1)}{(j + 1)(j + 2)} \tag{2-33}$$

On taking $a_0$ and $a_1$ as arbitrary constants, the solution becomes 

$$y = a_0\left(1 - \frac{l(l+1)}{2}x^2 - \frac{l(l+1)}{2}\,\frac{6 - l(l+1)}{12}x^4 - \cdots\right) + a_1\left(x + \frac{2 - l(l+1)}{6}x^3 + \frac{2 - l(l+1)}{6}\,\frac{12 - l(l+1)}{20}x^5 + \cdots\right)$$

$$= a_0\left[1 - \frac{l(l+1)}{2!}x^2 + \frac{l(l-2)(l+1)(l+3)}{4!}x^4 - \cdots + (-1)^r\frac{l(l-2)\cdots(l-2r+2)(l+1)\cdots(l+2r-1)}{(2r)!}x^{2r} + \cdots\right]$$

$$\quad + a_1\left[x - \frac{(l-1)(l+2)}{3!}x^3 + \frac{(l-1)(l-3)(l+2)(l+4)}{5!}x^5 - \cdots + (-1)^r\frac{(l-1)(l-3)\cdots(l-2r+1)(l+2)\cdots(l+2r)}{(2r+1)!}x^{2r+1} + \cdots\right] \tag{2-34'}$$

Choice 2. Let us take $\kappa = 1$. Eq. (32) then reads 

$$a_{j+2} = a_j\frac{(j + 1)(j + 2) - l(l + 1)}{(j + 2)(j + 3)}$$

If now we take again $a_0$ and $a_1$ as arbitrary constants, we find 

$$y = x\left(1 + \frac{2 - l(l+1)}{6}x^2 + \frac{2 - l(l+1)}{6}\,\frac{12 - l(l+1)}{20}x^4 + \cdots\right)a_0 + x\left(x + \frac{6 - l(l+1)}{12}x^3 + \frac{6 - l(l+1)}{12}\,\frac{20 - l(l+1)}{30}x^5 + \cdots\right)a_1$$

$$= x\left[1 - \frac{(l-1)(l+2)}{3!}x^2 + \frac{(l-1)(l-3)(l+2)(l+4)}{5!}x^4 - \cdots\right]a_0 + x\left[x - \frac{(l-2)(l+3)}{12}x^3 + \frac{(l-2)(l-4)(l+3)(l+5)}{360}x^5 - \cdots\right]a_1 \tag{2-35'}$$


The terms multiplying $a_0$ in (35') are seen to be identical with those 
multiplying $a_1$ in (34'); hence these two particular solutions are the same. The 
second part of (35'), however, does not agree with the first of (34'), both 
of which represent series in even powers of x. It might seem, therefore, as 
if we had obtained altogether three independent solutions, which is, of 
course, impossible. But closer inspection would show that the second part 
of (35') is not a solution at all. This is seen at once if, after assuming any 
specific value for l, we substitute it back into the differential equation. 
The trouble is that, putting $\kappa = 1$ and $a_0 = 0$, we have carelessly discarded 
any constant term which might appear in the sequence. The present 
example indicates clearly that the solution of a differential equation is not 
an altogether mechanical matter and that caution must be used at every 
step. Summarizing, we observe that the significant parts of (34') and 
(35') are: 




$$y = \left[1 - \frac{l(l+1)}{2!}x^2 + \frac{l(l-2)(l+1)(l+3)}{4!}x^4 - \cdots + (-1)^r\frac{l(l-2)\cdots(l-2r+2)(l+1)\cdots(l+2r-1)}{(2r)!}x^{2r} + \cdots\right]a_0 \tag{2-34}$$

$$y = \left[x - \frac{(l-1)(l+2)}{3!}x^3 + \frac{(l-1)(l-3)(l+2)(l+4)}{5!}x^5 - \cdots + (-1)^r\frac{(l-1)(l-3)\cdots(l-2r+1)(l+2)\cdots(l+2r)}{(2r+1)!}x^{2r+1} + \cdots\right]a_1 \tag{2-35}$$

Problem. Show that the equation y'' + y = 0, if integrated in series, has two 
particular solutions, one of which may be identified with the cosine series, the other with 
the sine series. 

One further point should be observed. When any one term in (34) is 
zero, all succeeding terms vanish also and the series becomes a polynomial. 
The conditions under which infinite series like (34) reduce to polynomials 
are of great importance in many physical problems and will be discussed 
more fully later. 
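
The recurrence (32) lends itself to a direct computation. The sketch below is illustrative only: the function name `legendre_series_coeffs` and the normalization $a_0 = 1$ are assumptions, and exact rational arithmetic from Python's `fractions` module is used so that vanishing coefficients are recognized exactly. It shows the series terminating for integral l and the ratio of successive coefficients approaching unity otherwise.

```python
from fractions import Fraction

def legendre_series_coeffs(l, kappa, n_terms):
    """Coefficients a_0, a_2, a_4, ... of x**kappa * sum(a_j * x**j),
    generated with recurrence (32), starting from a_0 = 1."""
    l = Fraction(l)
    a = Fraction(1)
    coeffs = [a]
    j = 0
    for _ in range(n_terms - 1):
        num = (kappa + j) * (kappa + j + 1) - l * (l + 1)
        den = (kappa + j + 1) * (kappa + j + 2)
        a = a * num / den
        coeffs.append(a)
        j += 2
    return coeffs

# l = 2, kappa = 0: the even series stops after the x**2 term.
print(legendre_series_coeffs(2, 0, 4))
# l = 3, kappa = 1: the odd series stops after the x**3 term.
print(legendre_series_coeffs(3, 1, 3))
# Non-integral l: the series does not terminate.
c = legendre_series_coeffs(Fraction(1, 2), 0, 1002)
print(float(c[-1] / c[-2]))
```

For l = 2 the surviving terms are $1 - 3x^2$, and for l = 3 they are $x - \tfrac{5}{3}x^3$; both are constant multiples of the corresponding Legendre polynomials.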

The work thus far has only established the fact that the series (34) and 
(35) are formal solutions of Legendre's equation, that is, they would 
satisfy (28) if substituted in it. Whether the solutions are of any interest 
depends on their convergence properties. A series converges if the ratio of 
the absolute values of two successive terms, $|u_{j+2}/u_j|$, tends to a limit 
smaller than unity for large j. Now this ratio is clearly 
$|a_{j+2}/a_j|\,x^2$. But $a_{j+2}/a_j$ 
is immediately obtainable from (33). As $j \to \infty$ it becomes 1. Hence 
the condition that (34) and (35) converge is that $x^2 < 1$, and this is true as 
long as |x| < 1. For values of x in the range -1 < x < 1 our solution is 
a significant one; for other values it fails. Is it possible to construct a 
solution valid for |x| > 1? This is indeed not difficult. 

Let us suppose that y, instead of being given by (29), has the form 
$y = \sum_\lambda a_\lambda x^{\kappa-\lambda}$. Eq. (30) will then read 

$$\sum_\lambda a_\lambda(\kappa - \lambda)(\kappa - \lambda - 1)x^{\kappa-\lambda-2} - \sum_\lambda a_\lambda[(\kappa - \lambda)(\kappa - \lambda + 1) - l(l + 1)]x^{\kappa-\lambda} = 0$$

$\kappa$ now denotes the highest power occurring in the series. The indicial 
equation is obtained by putting the coefficient of the highest power of x equal to 
zero. Thus 

$$\kappa(\kappa + 1) - l(l + 1) = 0$$

whence 

$$\kappa = l \quad \text{or} \quad -l - 1$$

As before, the coefficient of $x^{\kappa-j}$ must vanish for every positive integer j. 
This implies 

$$a_{j-2}(\kappa - j + 2)(\kappa - j + 1) = a_j[(\kappa - j)(\kappa - j + 1) - l(l + 1)]$$

or, on replacing j by j + 2, 

$$a_{j+2} = a_j\frac{(\kappa - j)(\kappa - j - 1)}{(\kappa - j - 2)(\kappa - j - 1) - l(l + 1)} \tag{2-36}$$

Choice 1. Let us take $\kappa = l$. Eq. (36) then reads 

$$a_{j+2} = a_j\frac{(l - j)(l - j - 1)}{(j + 2)(j - 2l + 1)}$$

If $a_0$ is chosen arbitrarily, the series becomes 

$$y = a_0x^l\left(1 - \frac{l(l-1)}{2(2l-1)}x^{-2} + \frac{l(l-1)(l-2)(l-3)}{2\cdot 4(2l-1)(2l-3)}x^{-4} - \cdots + (-1)^r\frac{(l-2r+1)(l-2r+2)\cdots(l-1)l}{2\cdot 4\cdots 2r\,(2l-2r+1)\cdots(2l-1)}x^{-2r} + \cdots\right) \tag{2-37}$$


The series formally obtained by putting $a_0 = 0$ is of no interest since it 
violates the assumption, previously made, that $\kappa$, i.e., l, represents the 
highest power of the sequence. We shall therefore omit it at once. 

Choice 2. Let us take $\kappa = -l - 1$. Then 

$$a_{j+2} = a_j\frac{(j + l + 1)(j + l + 2)}{(j + 2)(2l + j + 3)}$$

If again we put $a_1 = 0$, there results the particular solution 

$$y = a_0x^{-l-1}\left(1 + \frac{(l+1)(l+2)}{2(2l+3)}x^{-2} + \frac{(l+1)(l+2)(l+3)(l+4)}{2\cdot 4(2l+3)(2l+5)}x^{-4} + \cdots + \frac{(l+1)\cdots(l+2r)}{2\cdot 4\cdots 2r\,(2l+3)\cdots(2l+2r+1)}x^{-2r} + \cdots\right) \tag{2-38}$$


The two solutions (37) and (38) are independent, hence their sum represents 
the general solution of Legendre's equation. It is easily seen to converge 
if |x| > 1, unless l has such a value that the denominator of one of 
the coefficients in the series vanishes. This case will be studied shortly. 

We are now in possession of two forms of solution of eq. (28). The first 
(eqs. 34 and 35) converges when |x| < 1, the second (eqs. 37 and 38) 
when |x| > 1. Under special circumstances, however, (34) or (35) as 
well as (37) or (38) may become polynomials, which remain finite for every 
finite value of x. It is interesting to see what happens to the various 
particular solutions when this contingency arises. 

Eq. (34) reduces to a polynomial when l is an even positive or an odd 
negative integer (or zero). 




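a. Let l be even and positive. Then (34) reads 

$$y = a_0\left[1 - \frac{l(l+1)}{2!}x^2 + \cdots + (-1)^{l/2}\frac{l(l-2)\cdots 2\,(l+1)\cdots(2l-1)}{l!}x^l\right]$$

On the other hand, (37) becomes under these conditions 

$$y = a_0x^l\left[1 - \frac{l(l-1)}{2(2l-1)}x^{-2} + \cdots + (-1)^{l/2}\frac{l!}{l(l-2)\cdots 2\,(l+1)\cdots(2l-1)}x^{-l}\right]$$

These two solutions become identical if the second is multiplied by the 
constant factor 

$$(-1)^{l/2}\frac{l(l-2)\cdots 2\,(l+1)\cdots(2l-1)}{l!}$$

Hence the particular solution (34) coalesces with (37). 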

b. Let l be odd and negative. Inspection shows that (34) now becomes 
identical with (38). 

Eq. (35) reduces to a polynomial when l is an odd positive or an even 
negative integer. 

c. If l is odd and positive, (35) reads 

$$y = a\left(x - \frac{(l-1)(l+2)}{3!}x^3 + \cdots + (-1)^{(l-1)/2}\frac{(l-1)(l-3)\cdots 2\,(l+2)\cdots(2l-1)}{l!}x^l\right)$$

while (37) becomes 

$$y = ax^l\left(1 - \frac{l(l-1)}{2(2l-1)}x^{-2} + \cdots + (-1)^{(l-1)/2}\frac{l!}{2\cdot 4\cdots(l-1)(l+2)\cdots(2l-1)}x^{-l+1}\right)$$

These two expressions become identical when the second is multiplied by 
the coefficient of the last term in the parenthesis of the first. 


d. If l is an even and negative integer, (35) turns into (38). 

Having established these important relations between solutions (34)-(38) 
we now return to the consideration of (37) and (38). Solutions (37) 
and (38) for integral values of l are of great importance in mathematical 
physics. If the constant $a_0$ in (37) is chosen to be 

$$\frac{(2l)!}{2^l(l!)^2} = \frac{(2l-1)(2l-3)\cdots 1}{l!}$$

the resulting polynomial of degree l is called a Legendre polynomial (or 
Legendre coefficient or "zonal harmonic"). It is usually denoted by $P_l$. 



For purposes of reference we write it down again: 

$$P_l(x) = \frac{1\cdot 3\cdot 5\cdots(2l-1)}{l!}\left\{x^l - \frac{l(l-1)}{2(2l-1)}x^{l-2} + \frac{l(l-1)(l-2)(l-3)}{2\cdot 4(2l-1)(2l-3)}x^{l-4} - \cdots\right\} \tag{2-39}$$

The series here is to be continued down to the constant term. On the 
other hand, (38) with the constant $a_0$ chosen to be $2^l(l!)^2/(2l+1)!$, 
l being a positive integer, is often denoted by $Q_l$. It is an infinite series: 

$$Q_l(x) = \frac{l!}{1\cdot 3\cdots(2l+1)}\left\{x^{-(l+1)} + \frac{(l+1)(l+2)}{2(2l+3)}x^{-l-3} + \cdots + \frac{(l+1)\cdots(l+2r)}{2\cdot 4\cdots 2r\,(2l+3)\cdots(2l+2r+1)}x^{-l-2r-1} + \cdots\right\} \tag{2-40}$$
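
As a check on (39), the polynomial can be assembled term by term. The sketch below is an illustration under stated assumptions: the function name `legendre_P_coeffs` and the dictionary representation are hypothetical, and successive bracket terms are generated from the descending recurrence with $\kappa = l$ rather than typed in by hand.

```python
from fractions import Fraction
from math import factorial

def legendre_P_coeffs(l):
    """Return {power: coefficient} for P_l as written in (39), l = 0, 1, 2, ..."""
    # Leading factor 1*3*5*...*(2l-1)/l!, written as (2l)!/(2^l (l!)^2).
    lead = Fraction(factorial(2 * l), 2**l * factorial(l)**2)
    coeffs = {}
    term = Fraction(1)          # bracket coefficient of x**(l - 2r)
    r = 0
    while l - 2 * r >= 0:
        coeffs[l - 2 * r] = lead * term
        j = 2 * r
        # Ratio a_{j+2}/a_j of the descending series with kappa = l.
        term = term * (l - j) * (l - j - 1) / ((j + 2) * (j - 2 * l + 1))
        r += 1
    return coeffs

print(legendre_P_coeffs(2))     # coefficients of (3x^2 - 1)/2
print(legendre_P_coeffs(3))     # coefficients of (5x^3 - 3x)/2
```

With this normalization the coefficients of each polynomial sum to 1, which is the familiar property $P_l(1) = 1$.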


The following facts will be noted: 

When l is a positive integer, (37) is a polynomial, but (38) is an infinite 
series. The general solution of (28) is a linear combination of (37) 
and (38). 

When l is a negative integer, (37) is an infinite series, and (38) is a 
polynomial. The general solution of (28) is a linear combination of (37) 
and (38). 

When 2l is equal to some positive odd integer, solution (37) degenerates 
into (38). To see this, suppose 2l = 2n - 1. There will then appear a 
vanishing denominator in the coefficient of $x^{l-2n}$ and in every subsequent 
term of (37). To remove these infinities one may multiply the entire series 
by (n - r), which causes all terms of order higher than l - 2n to vanish 
while the others remain finite. Hence the series begins with the power 
$x^{l-2n}$ and inspection shows it then to be identical with (38). 

In this case, our method has yielded but one particular solution, and this is 
an infinite series. Procedures leading to a general solution are discussed in 
treatises on Differential Equations.⁷ 

When 2l is equal to an odd negative integer, (38) degenerates into (37) 
in a manner similar to the above. In that case also no general solution can 
be obtained by the present method. 
Having now given a fairly complete mathematical analysis of the solutions 
of Legendre's equation, we state some conclusions of practical importance. 
In almost all applications (cf. Chapters 7, 8, 11) the independent 
variable x appearing in eq. (28) is the cosine of an angle. The functions of 
interest are therefore those which remain finite for all values which $x = \cos\theta$ 
can assume; these values include $x = \pm 1$. Such functions exist only when 
l is a positive or a negative integer, as we have shown. But when l is an 
integer, consideration may be limited to solutions (37) and (38), because 
the others reduce to these. Moreover, inspection shows that solution (38) 
with l replaced by -(l + 1) is the same as solution (37). Hence we may 
further limit our consideration to positive values of l (including 0) and 
retain only (37) as a significant solution. Finally we note that (37) is 
identical with (39). Hence: 

In physico-chemical problems, where $x = \cos\theta$, the only solution of 
Legendre's equation which is of practical interest is $P_l(\cos\theta)$. 

⁷ See Forsyth, A. R., "Differential Equations," Macmillan Co., London, 1914. 

Problems. 

a. Prove that, when l is an even negative integer, the expressions (35) and (38) 
become identical. 

b. Prove that, when 2l is an odd negative integer, expressions (37) and (38) become 
identical. 

Differential Equation for Associated Legendre Functions, or Associated 
Spherical Harmonics. 

An equation similar to Legendre's plays a considerable rôle in mathematical 
physics. It is⁸ 

$$(1 - x^2)y'' - 2xy' + \left[l(l + 1) - \frac{m^2}{1 - x^2}\right]y = 0 \tag{2-41}$$

where l and m are both integers, and has a particular solution: 

$$y = (1 - x^2)^{m/2}\frac{d^m}{dx^m}P_l(x) \tag{2-42}$$

The other particular solution is related to $Q_l$ and is rarely of interest in 
applications. To construct (42) by the method of series integration is 
perfectly feasible, but we shall here use a simpler method based on the 
foregoing results. If $P_l(x)$ is a solution of 

$$(1 - x^2)y'' - 2xy' + l(l + 1)y = 0$$

then $\dfrac{d^m}{dx^m}P_l(x)$, for which we shall write $P_l^{(m)}(x)$, satisfies the equation 

$$(1 - x^2)P_l^{(m)\prime\prime} - 2(m + 1)xP_l^{(m)\prime} + [l(l + 1) - m(m + 1)]P_l^{(m)} = 0 \tag{2-43}$$

as is seen when Legendre's equation is differentiated m times. Now let 

$$P_l^{(m)}(x) = (1 - x^2)^r y$$

and determine, by substituting this into (43), what differential equation y 
will satisfy. After substitution, (43) will read 

$$(1 - x^2)^{r-1}\{(4r^2x^2 - 2r - 2rx^2)y - 4r(1 - x^2)xy' + (1 - x^2)^2y'' - 2(m + 1)(1 - x^2)xy' + 4r(m + 1)x^2y + [l(l + 1) - m(m + 1)](1 - x^2)y\} = 0 \tag{2-44}$$

If here the special value r = -m/2 is chosen, this equation reduces to (41). 
We have shown, therefore, that (44) is true with r = -m/2 and hence that 

$$y = (1 - x^2)^{m/2}P_l^{(m)}$$

as was asserted. The function $P_l^{(m)}$, which is a polynomial of degree l - m 
and which satisfies eq. (43), is sometimes referred to by physicists as 
Helmholtz' function. The function (42) is known as an associated Legendre 
function, or more frequently, an associated spherical harmonic. 

⁸ The equation occurs more commonly in the equivalent forms 

$$\frac{d^2y}{d\theta^2} + \cot\theta\,\frac{dy}{d\theta} + \left[l(l + 1) - \frac{m^2}{\sin^2\theta}\right]y = 0$$

or 

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{dy}{d\theta}\right) + \left[l(l + 1) - \frac{m^2}{\sin^2\theta}\right]y = 0$$

which reduce to (41) on substitution of $\cos\theta = x$. 
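
The statement that the m-th derivative of $P_l$ satisfies (43) can be verified exactly for particular l and m. The sketch below rests on assumptions that are not part of the text: $P_l$ is generated from Bonnet's recursion $(n+1)P_{n+1} = (2n+1)xP_n - nP_{n-1}$ (a standard identity), polynomials are stored as coefficient lists with the lowest power first, and all function names are hypothetical.

```python
from fractions import Fraction

def legendre(l):
    """Coefficients (lowest power first) of P_l via Bonnet's recursion."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if l == 0:
        return p_prev
    for n in range(1, l):
        x_p = [Fraction(0)] + p                        # x * P_n
        nxt = [Fraction(2 * n + 1, n + 1) * c for c in x_p]
        for k, c in enumerate(p_prev):                 # minus n/(n+1) * P_{n-1}
            nxt[k] -= Fraction(n, n + 1) * c
        p_prev, p = p, nxt
    return p

def deriv(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def residual_43(l, m):
    """Left side of (43) for u = d^m P_l / dx^m, as a coefficient list."""
    u = legendre(l)
    for _ in range(m):
        u = deriv(u)
    du, d2u = deriv(u), deriv(deriv(u))
    res = [Fraction(0)] * (len(u) + 2)
    for k, c in enumerate(d2u):
        res[k] += c                                    # u''
        res[k + 2] -= c                                # -x^2 u''
    for k, c in enumerate(du):
        res[k + 1] -= 2 * (m + 1) * c                  # -2(m+1) x u'
    for k, c in enumerate(u):
        res[k] += (l * (l + 1) - m * (m + 1)) * c      # [l(l+1) - m(m+1)] u
    return res

print(residual_43(4, 2))    # every coefficient should vanish
```

Because the arithmetic is exact, a residual consisting entirely of zeros confirms (43) for the chosen l and m; it does not replace the general argument in the text.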

2.12. General Considerations Regarding Series Integration. Fuchs' 
Theorem. — Before continuing, the reader will wish to know the limits of 
applicability of the method applied in sec. 2.11, and in particular what 
properties of the solution one may read directly from the differential 
equation. First, then, let us ask the question: Will the method described 
in sec. 2.11 always work? In preparation for the answer, we consider the 
differential equation 

$$y'' + y/x^3 = 0$$

On putting $y = \sum_\lambda a_\lambda x^{\kappa+\lambda}$ it is seen that 

$$\sum_\lambda a_\lambda(\kappa + \lambda)(\kappa + \lambda - 1)x^{\kappa+\lambda-2} + \sum_\lambda a_\lambda x^{\kappa+\lambda-3} = 0$$

The indicial equation, obtained by putting the coefficient of the lowest 
power of x equal to zero, simply reads 

$$a_0 = 0$$

and does not determine $\kappa$. Furthermore, 

$$a_{j+1} = -(\kappa + j)(\kappa + j - 1)a_j$$

so that $a_1 = -a_0(\kappa - 1)\kappa$. Since $a_0 = 0$, this means that either $\kappa = \infty$ or 
$a_1$ is also zero. In neither case do we get any solution at all. 

Equally instructive is the equation 

$$y'' + x^{-2}y' = 0$$

Its indicial equation yields $\kappa = 0$. The recurrence relation between 
coefficients is 

$$a_{j+1} = -\frac{j(j - 1)}{j + 1}a_j$$

Thus we have apparently determined a solution. But let us apply a convergence 
test. Denoting again the terms of the series by $u_n$ one sees that 

$$\lim_{n\to\infty}\left|\frac{u_{n+1}}{u_n}\right| = \lim_{n\to\infty}\left|\frac{(n - 1)n}{n + 1}x\right| = \lim_{n\to\infty}|nx|$$

This is greater than 1 as $n \to \infty$ for every finite non-zero value of x, so that there is no 
range of x at all in which the series converges. Again, the method fails. 

To enlarge our outlook, let us now return to the general form of the 
equation we wish to solve, that is, to eq. (27). As a rule there will be 
values of x for which one or both of the functions $X_1$ and $X_2$ become 
infinite. If $x = x_0$ is such a value, then $x_0$ is said to be a singular point 
of the equation. It is at such singular points that the method of integration 
in series may break down. To be more specific, a solution of the form 
$y = \sum_\lambda a_\lambda(x - x_0)^{\kappa+\lambda}$ may not exist at singular points $x_0$. 

In dealing with Legendre's equation, a power series development was 
attempted about the point $x_0 = 0$. It succeeded because, after writing 
the equation in the form (27), neither $X_1 = -2x/(1 - x^2)$ nor 
$X_2 = l(l + 1)/(1 - x^2)$ becomes infinite at x = 0. But the points $x = \pm 1$ are 
singular points of the equation, and it is for this reason that the general 
solution obtained breaks down at these two points. Again, the two 
equations just considered, $y'' + x^{-3}y = 0$ and $y'' + x^{-2}y' = 0$, possess a 
singular point at x = 0, and this is the cause of the failure of the present 
method. 

But while the method often fails if the differential equation has a singular 
point at the place where the power series development is attempted, it 
does not always do so. For instance, the solution of the equation 

$$y'' + x^{-1}y' - x^{-2}y = 0$$

may be developed in the form $y = \sum_\lambda a_\lambda x^{\kappa+\lambda}$ despite its singularities at 
x = 0. The indicial equation yields $\kappa = \pm 1$. When the positive sign is 
chosen, the coefficients must satisfy the equation 

$$[(j + 1)^2 - 1]a_j = 0$$

which is no longer a recurrence relation but serves to determine the coefficients 
just as well. For it says that every $a_j = 0$, except for j = 0. 

The corresponding solution is $y = a_0x$. For $\kappa = -1$ we have 

$$[(j - 1)^2 - 1]a_j = 0$$

and this indicates that all coefficients must be zero except that corresponding 
to j = 0 and to j = 2. Hence the solution is 

$$y = x^{-1}(a_0 + a_2x^2)$$

The constants $a_0$ and $a_2$ are arbitrary, which implies that the solution is a 
general one, including y = const. x as a special case. Obviously, then, it is 
important to settle what kind of singularities do, and what kind do not, 
permit an integration in series about the singular point. 

This issue is settled by an important theorem due to Fuchs, which states 
the following: 

If the differential equation 

$$y'' + X_1y' + X_2y = 0$$

possesses a singular point at $x = x_0$, then a convergent development of the 
solution in a power series about the point $x = x_0$ having only a finite number 
of terms with negative exponents is nevertheless possible provided that 
$(x - x_0)X_1(x)$ and $(x - x_0)^2X_2(x)$ remain finite at $x = x_0$. 

This clearly is true for the equation 

$$y'' + x^{-1}y' - x^{-2}y = 0$$

at $x_0 = 0$, but not for 

$$y'' + x^{-2}y' = 0$$

Thus the results just obtained are accounted for. The proof of Fuchs' 
theorem is a matter of some length and will not be undertaken here.⁹ 
In conformity with the theorem, singularities in $X_1$ and $X_2$ occurring at 
$x = x_i$ which are removable by multiplication by the factors $(x - x_i)$ 
and $(x - x_i)^2$ respectively are called non-essential singularities of the 
differential equation; all others are essential ones.¹⁰ All regular and 
non-essentially singular points are sometimes referred to as regular points of the 
differential equations (German: "Stellen der Bestimmtheit"). An 
equation which has no essential singularities in the entire infinite complex 
plane is said to belong to the Fuchsian class of differential equations. 

A final remark on the nature of the solutions obtained by the method of 
integration in series is in order. Even if the point at which the development 
is made satisfies the Fuchs conditions it may not be possible to obtain 
two independent solutions which, when combined linearly with the use of 
two arbitrary constants, will yield the general solution. If this process is 
to produce a general solution, further conditions must be met. Since 
general solutions are not often required in physical and chemical applications, 
this matter will not be considered in detail here.¹¹ We note, however, 
that two independent solutions in the form $y_1 = \sum_\lambda a_\lambda x^{\kappa_1+\lambda}$ 
and $y_2 = \sum_\lambda b_\lambda x^{\kappa_2+\lambda}$ can always be obtained when the two roots of the 
indicial equation, $\kappa_1$ and $\kappa_2$, do not differ by an integer or by zero. 

⁹ See, for instance, Schmidt, H., "Theorie der Wellengleichung," Leipzig, 1931. 

¹⁰ Whether the point at infinity is an essentially singular one cannot at once be seen 
in this way. To examine it the transformation $\xi = 1/x$ must be made. One may then 
show that the point at infinity is essentially singular if $X_1x$ or $X_2x^2$ become infinite there; 
it is non-essentially singular if $2x - X_1x^2 \to \infty$ or $X_2x^4 \to \infty$; otherwise it is regular. 
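
The Fuchs conditions stated in this section can be probed numerically for a given equation. The sketch below is a heuristic only, not a proof of finiteness: the function name `looks_fuchsian`, the sample offsets `eps`, and the threshold `bound` are assumptions chosen for illustration. It evaluates $(x - x_0)X_1$ and $(x - x_0)^2X_2$ at points approaching $x_0$ and reports whether both stay below the threshold.

```python
def looks_fuchsian(X1, X2, x0=0.0, eps=(1e-2, 1e-4, 1e-6), bound=1e3):
    """Heuristic test: do (x - x0)*X1 and (x - x0)**2*X2 stay bounded near x0?"""
    for e in eps:
        x = x0 + e
        if abs(e * X1(x)) > bound or abs(e**2 * X2(x)) > bound:
            return False
    return True

# y'' + y'/x - y/x^2 = 0: series integration about x = 0 succeeds.
print(looks_fuchsian(lambda x: 1 / x, lambda x: -1 / x**2))
# y'' + y/x^3 = 0 and y'' + y'/x^2 = 0: the method fails at x = 0.
print(looks_fuchsian(lambda x: 0.0, lambda x: 1 / x**3))
print(looks_fuchsian(lambda x: 1 / x**2, lambda x: 0.0))
# Legendre's equation with l = 2, examined at the singular point x = 1.
print(looks_fuchsian(lambda x: -2 * x / (1 - x**2),
                     lambda x: 6 / (1 - x**2), x0=1.0))
```

A fixed threshold cannot distinguish a large finite limit from a genuine divergence, so the result should be read as a quick screen before the conditions are checked analytically.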


SPECIAL EQUATIONS SOLVABLE BY SERIES INTEGRATION 


2.13. Gauss' (Hypergeometric) Differential Equation. — 

$$(x^2 - x)y'' + [(1 + \alpha + \beta)x - \gamma]y' + \alpha\beta y = 0 \tag{2-45}$$

The parameters $\alpha$, $\beta$, $\gamma$ are constants, and it will be assumed that $\gamma$ is not 
an integer. Eq. (45) has singularities at 0, 1, and $\infty$, but they are all 
non-essential. On development about x = 0, the indicial equation reads 

$$\kappa(\kappa - 1) + \kappa\gamma = 0$$

hence $\kappa = 0,\ 1 - \gamma$. 

Choosing $\kappa = 0$, we obtain the recurrence formula 

$$a_{j+1} = a_j\frac{(\alpha + j)(\beta + j)}{(j + 1)(j + \gamma)} \tag{2-46}$$


and hence the particular solution 

$$y = a\left\{1 + \frac{\alpha\beta}{1\cdot\gamma}x + \frac{\alpha(\alpha+1)\beta(\beta+1)}{1\cdot 2\cdot\gamma(\gamma+1)}x^2 + \cdots + \frac{\alpha(\alpha+1)\cdots(\alpha+r-1)\cdot\beta(\beta+1)\cdots(\beta+r-1)}{r!\,\gamma(\gamma+1)\cdots(\gamma+r-1)}x^r + \cdots\right\} \tag{2-47}$$

The series in { } is known as the hypergeometric series. It converges if 
|x| < 1. For $\alpha = 1$, $\beta = \gamma$ it reduces to the ordinary geometric series; 
hence its name. It is customary to denote the hypergeometric series by 
$F(\alpha, \beta, \gamma; x)$. With this abbreviation, then, this particular solution is 

$$y = aF(\alpha, \beta, \gamma; x)$$

Next, we take $\kappa = 1 - \gamma$. The recurrence relation reads 

$$a_{j+1} = a_j\frac{(\alpha - \gamma + j + 1)(\beta - \gamma + j + 1)}{(j + 1)(j + 2 - \gamma)} \tag{2-48}$$

When the new constants $\alpha' = \alpha - \gamma + 1$, $\beta' = \beta - \gamma + 1$, $\gamma' = 2 - \gamma$ 
are introduced in (48) it becomes 

$$a_{j+1} = a_j\frac{(\alpha' + j)(\beta' + j)}{(j + 1)(j + \gamma')}$$

that is, it takes the same form as (46). The particular solution corresponding 
to (48) may therefore be written 

$$ax^{1-\gamma}F(\alpha - \gamma + 1, \beta - \gamma + 1, 2 - \gamma; x)$$

We have thus arrived at the following general solution of (45): 

$$y = AF(\alpha, \beta, \gamma; x) + Bx^{1-\gamma}F(\alpha - \gamma + 1, \beta - \gamma + 1, 2 - \gamma; x) \tag{2-49}$$

whose range of convergence is |x| < 1. 

¹¹ For particulars, see Bôcher, M., "Regular Points of Linear Differential Equations 
of the Second Order," Harvard University Press. 

There is an interesting and sometimes useful relation between the 
solutions of Gauss' and those of Legendre's equation. Let us introduce in 
(45) the new independent variable $\xi$ given by 

$$x = \tfrac{1}{2}(1 - \xi)$$

so that it takes the form 

$$(1 - \xi^2)\frac{d^2y}{d\xi^2} + [1 + \alpha + \beta - 2\gamma - (\alpha + \beta + 1)\xi]\frac{dy}{d\xi} - \alpha\beta y = 0 \tag{2-50}$$

This reduces to Legendre's equation (28) if we specify the constants to be 

$$\alpha = l + 1, \quad \beta = -l, \quad \gamma = 1$$

One particular solution of Legendre's equation is therefore 

$$y = aF\left(l + 1, -l, 1; \frac{1 - \xi}{2}\right)$$

From the fact that this solution, expanded in powers of $\xi$, starts with a 
constant term it is clear that it must be identical (aside from a constant 
factor) with (34). In particular, if l is a positive integer, it must be $P_l$. 
This happens to be true, as the reader may verify, even with respect to the 
constant factor if $P_l$ is defined as in (39). Thus 

$$P_l(\xi) = F\left(l + 1, -l, 1; \frac{1 - \xi}{2}\right) \tag{2-51}$$
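
Both the reduction to the geometric series and the relation (51) can be tried out with a few lines of code. This is a minimal sketch with assumed names: `hyp_F` and the truncation parameter `n_terms` are illustrative, and the partial sum is built term by term from the recurrence (46) in exact rational arithmetic.

```python
from fractions import Fraction

def hyp_F(alpha, beta, gamma, x, n_terms=60):
    """Partial sum of the hypergeometric series (47), built from recurrence (46)."""
    term = Fraction(1)
    total = Fraction(0)
    for j in range(n_terms):
        total += term
        term = term * (alpha + j) * (beta + j) * x / ((j + 1) * (j + gamma))
    return total

# alpha = 1, beta = gamma: the partial sums of the geometric series in x.
print(hyp_F(1, 2, 2, Fraction(1, 2), n_terms=20))

# Relation (51) for l = 2 at xi = 1/3, compared with P_2 = (3 xi^2 - 1)/2.
xi = Fraction(1, 3)
print(hyp_F(3, -2, 1, (1 - xi) / 2), (3 * xi**2 - 1) / 2)
```

For a negative integral second parameter the terms vanish identically after a finite number of steps, so the truncation does not affect the polynomial case.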


An equation known to mathematicians as Tschebyscheff's results when 
in (50) we specialize the constants as follows: 

$$\alpha = -\beta = n, \text{ an integer}; \quad \gamma = \tfrac{1}{2}$$

The equation then reads: 

$$(1 - \xi^2)\frac{d^2y}{d\xi^2} - \xi\frac{dy}{d\xi} + n^2y = 0 \tag{2-52}$$

Its solution is clearly 

$$y(\xi) = AF\left(n, -n, \tfrac{1}{2}; \frac{1 - \xi}{2}\right) + B\left(\frac{1 - \xi}{2}\right)^{1/2}F\left(n + \tfrac{1}{2}, -n + \tfrac{1}{2}, \tfrac{3}{2}; \frac{1 - \xi}{2}\right) \tag{2-53}$$

The first particular solution here written is a polynomial known as the 
Tschebyscheff polynomial, of degree n. If multiplied by the proper factor 
it has the alternative form: 

$$T_n(x) = 2^{n-1}\left(x^n - \frac{n}{1!\,2^2}x^{n-2} + \frac{n(n - 3)}{2!\,2^4}x^{n-4} - \frac{n(n - 4)(n - 5)}{3!\,2^6}x^{n-6} + \cdots\right) \tag{2-54}$$
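
The series (54) can be compared with the trigonometric form $\cos(n \arccos x)$, a standard identity for this normalization that is not derived in the text. The sketch below makes two assumptions: the function name `tschebyscheff` is hypothetical, and the general bracket term is taken to be $(-1)^k\, n(n-k-1)(n-k-2)\cdots(n-2k+1)/(k!\,2^{2k})$, which reproduces the four terms displayed in (54).

```python
import math

def tschebyscheff(n, x):
    """Evaluate the series (54) for integer n >= 1."""
    total = 0.0
    for k in range(n // 2 + 1):
        if k == 0:
            coeff = 1.0
        else:
            # n (n-k-1)(n-k-2)...(n-2k+1) / (k! 2**(2k)), alternating in sign
            prod = n
            for i in range(n - 2 * k + 1, n - k):
                prod *= i
            coeff = (-1)**k * prod / (math.factorial(k) * 2**(2 * k))
        total += coeff * x**(n - 2 * k)
    return 2**(n - 1) * total

for n in (2, 3, 6):
    x = 0.3
    print(n, tschebyscheff(n, x), math.cos(n * math.acos(x)))
```

The two columns agree to rounding error for |x| <= 1; the closed form of the general term is an inference from the displayed terms and should be treated as such.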

The function $F(\alpha, \beta, \gamma; x)$ reduces to a polynomial when $\alpha = -n$, 
n being a positive integer, as may be seen from its definition (47). The 
resulting polynomial, which is of degree n, is known as a Jacobi polynomial, 
defined as follows: 

$$J_n(p, q, x) = F(-n, p + n, q; x) \tag{2-55}$$

It satisfies the differential equation 

$$(x^2 - x)y'' + [(1 + p)x - q]y' - n(p + n)y = 0 \tag{2-56}$$

in which q must satisfy q > 0. Substitution of $\alpha = -n$, $\beta = p + n$, 
$\gamma = q$ into (47) shows that¹² 

$$J_n(p, q, x) = 1 + \sum_{\lambda=1}^{n}(-1)^\lambda\binom{n}{\lambda}\frac{(p + n)(p + n + 1)\cdots(p + n + \lambda - 1)}{q(q + 1)\cdots(q + \lambda - 1)}x^\lambda$$

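Problem. Find the solution of (45) about the point x = 1; i.e., find solutions of 
the form 

$$y = \sum_\lambda a_\lambda(x - 1)^{\kappa+\lambda}$$

Ans. 

$$y = AF(\alpha, \beta, \alpha + \beta - \gamma + 1; 1 - x) + B(1 - x)^{\gamma-\alpha-\beta}F(\gamma - \beta, \gamma - \alpha, 1 - \alpha - \beta + \gamma; 1 - x)$$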

¹² Cf. eq. 12-2 for the definition of the binomial coefficient $\binom{n}{\lambda}$. 

2.14. Bessel's Equation. — 

$$x^2y'' + xy' + (x^2 - n^2)y = 0 \tag{2-57}$$

n is a constant. Since the equation is regular at x = 0, its solution may 
be developed as a power series about that point. The indicial equation 

$$(\kappa^2 - n^2)a_0 = 0$$

has the two roots $\kappa = \pm n$. According to the remarks at the end of sec. 2.12 
we can obtain two independent particular solutions if 2n is not an integer; 
if it is, the method may allow us to determine only one. Taking $\kappa = n$ 
one finds 

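$$y = a_0x^n\left[1 - \frac{x^2}{2(2n + 2)} + \frac{x^4}{2\cdot 4(2n + 2)(2n + 4)} - \cdots + (-1)^r\frac{x^{2r}}{2\cdot 4\cdots 2r\,(2n + 2)(2n + 4)\cdots(2n + 2r)} + \cdots\right] \tag{2-58}$$

For $\kappa = -n$ 

$$y = a_0x^{-n}\left[1 + \frac{x^2}{2(2n - 2)} + \frac{x^4}{2\cdot 4(2n - 2)(2n - 4)} + \cdots + \frac{x^{2r}}{2\cdot 4\cdots 2r\,(2n - 2)(2n - 4)\cdots(2n - 2r)} + \cdots\right] \tag{2-59}$$

When the constant $a_0$ in (58) is chosen to be¹³ $1/[2^n\Gamma(n + 1)]$, the resulting 
expression 

$$J_n(x) = \sum_{\lambda=0}^{\infty}\frac{(-1)^\lambda}{\Gamma(\lambda + 1)\Gamma(\lambda + n + 1)}\left(\frac{x}{2}\right)^{n+2\lambda} \tag{2-60}$$

is called a Bessel function of order n. 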
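
The series (60) converges quickly for moderate x and is easy to evaluate. The following sketch is illustrative only: the function name `bessel_J` and the truncation `n_terms` are assumptions, `math.gamma` from the Python standard library supplies the Gamma function, and the comparison values (the closed forms for order one half and the first zero of $J_0$) are standard results quoted from outside the text.

```python
import math

def bessel_J(n, x, n_terms=40):
    """Partial sum of series (60) for x > 0.

    The order n may be any real number for which no Gamma argument
    lam + n + 1 is zero or a negative integer (for example n > -1).
    """
    total = 0.0
    for lam in range(n_terms):
        total += ((-1)**lam
                  / (math.gamma(lam + 1) * math.gamma(lam + n + 1))
                  * (x / 2)**(n + 2 * lam))
    return total

x = 2.0
# Half-integral order: J_{1/2} and J_{-1/2} are independent elementary functions.
print(bessel_J(0.5, x), math.sqrt(2 / (math.pi * x)) * math.sin(x))
print(bessel_J(-0.5, x), math.sqrt(2 / (math.pi * x)) * math.cos(x))
# Near the first zero of J_0.
print(bessel_J(0, 2.404825557695773))
```

For large x the alternating terms cancel heavily in floating point, so this direct summation is suitable only for small and moderate arguments.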

When (59) is multiplied by the same factor it becomes J_n(x). Hence 
the complete solution of Bessers equation (when n is not an integer) is 

y A Jn(x) + (2-61) 


Inspection of (58) and (59) shows that no difficulty arises when n is half- 
integral, although the difference of the roots of the indicial equation is an 
integer. But if n is an integer, t/_n is no longer independent of Jn- For 
in that case the coefficient of in (59) has a vanishing term in the denomi- 
nator, and every subsequent coefficient likewise becomes infinite. Multi- 
plication by the vanishing term makes every term preceding the n-th zero. 
The series then starts with x” and is seen to be identical (except for a 
constant multiplier) with (58). For integral n, therefore, we have obtained 
only one solution, namely Jnix)}^ By choosing the constants A and B of 


The Gamma function appearing here is a generalization of the factorial n! which 
is defined only for integers (and zero). If n is an integer, r(n -f- 1) = nl. In general, 


r(x) 


i: 


e it is easily seen to reduce to nl when x = n. Moreover, this in- 


tegral defines the smoothest function which takes on the values n! at the integers. 
Cf. sec. 3.2. 


¹¹ The second particular solution for integral n is derived in Forsyth, "Differential
Equations," Macmillan, p. 182.

(61) suitably, several particular solutions of Bessel's equation (such as
Neumann's and Hankel's functions) having useful properties may be constructed.
They will be discussed in sec. 3.9.
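The series (60) is easily summed numerically. The following Python sketch is only an illustration under stated assumptions: the function names are ours, the argument x is taken positive, and the reciprocal of the Gamma function is set equal to zero at its poles, which is what makes the first n terms of $J_{-n}$ drop out when n is an integer.

```python
import math


def reciprocal_gamma(x):
    """Return 1/Gamma(x), taken as 0 at the poles x = 0, -1, -2, ..."""
    if x <= 0 and float(x).is_integer():
        return 0.0
    return 1.0 / math.gamma(x)


def bessel_j(n, x, terms=40):
    """Partial sum of the series (2-60) for J_n(x); assumes x > 0."""
    total = 0.0
    for lam in range(terms):
        coeff = reciprocal_gamma(lam + 1) * reciprocal_gamma(lam + n + 1)
        total += (-1) ** lam * coeff * (x / 2.0) ** (n + 2 * lam)
    return total


if __name__ == "__main__":
    x = 2.5
    # For integral n the two series are not independent: J_{-1} = -J_1.
    print(bessel_j(1, x), bessel_j(-1, x))
    # For half-integral n both series remain usable.
    print(bessel_j(0.5, x), bessel_j(-0.5, x))
```

With integral n the two calls differ only in sign, in agreement with the statement that (59) then reproduces (58) up to a constant multiplier.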

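2.15. Hermite's Differential Equation.

$$y'' - 2xy' + 2\alpha y = 0; \qquad \alpha = \text{constant} \tag{2-62}$$

The roots of the indicial equation are $\kappa = 0, 1$; the recurrence relation
between the coefficients is

$$a_{j+2} = \frac{2(\kappa+j) - 2\alpha}{(\kappa+j+2)(\kappa+j+1)}\,a_j$$

For $\kappa = 0$ we find the solution

$$y = a_0\left[1 - \frac{2\alpha}{2!}x^2 + \frac{2^2\alpha(\alpha-2)}{4!}x^4 - \frac{2^3\alpha(\alpha-2)(\alpha-4)}{6!}x^6 + \cdots + (-2)^r\frac{\alpha(\alpha-2)\cdots(\alpha-2r+2)}{(2r)!}x^{2r} + \cdots\right] \tag{2-63}$$

while for $\kappa = 1$

$$y = a_0x\left[1 - \frac{2(\alpha-1)}{3!}x^2 + \frac{2^2(\alpha-1)(\alpha-3)}{5!}x^4 - \cdots + (-2)^r\frac{(\alpha-1)(\alpha-3)\cdots(\alpha-2r+1)}{(2r+1)!}x^{2r} + \cdots\right] \tag{2-64}$$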


The general solution of Hermite's equation is a superposition of these. If
$\alpha$ is an even integer n, (63) reduces to an even polynomial of degree n.
On choosing for $a_0$ the value

$$(-1)^{n/2}\,\frac{n!}{(n/2)!}$$

this polynomial becomes

$$H_n(x) = (2x)^n - \frac{n(n-1)}{1!}(2x)^{n-2} + \frac{n(n-1)(n-2)(n-3)}{2!}(2x)^{n-4} - \cdots \tag{2-65}$$

and this is known as the Hermite polynomial of degree n. If $\alpha$ is an odd
integer n, (64) reduces to an odd polynomial of degree n. In fact if we
choose for $a_0$ the value

$$(-1)^{(n-1)/2}\,\frac{2\cdot n!}{\left(\frac{n-1}{2}\right)!}$$

that particular solution also takes on the form $H_n(x)$.
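The termination of the series for integral $\alpha$ can be checked by running the recurrence relation directly. The sketch below is illustrative only: the function name is ours, the recurrence is written in terms of the power p of x (so that p stands for $\kappa + j$), and exact rational arithmetic is used so that the comparison with the familiar polynomials is not blurred by rounding.

```python
from fractions import Fraction
from math import factorial


def hermite_coefficients(n):
    """Coefficients c[p] of x**p in H_n(x), built from the recurrence relation."""
    coeffs = [Fraction(0)] * (n + 1)
    half = n // 2
    if n % 2 == 0:
        # starting value a_0 for the even series (2-63)
        coeffs[0] = Fraction((-1) ** half * factorial(n), factorial(half))
        start = 0
    else:
        # starting value a_0 for the odd series (2-64); it multiplies x**1
        coeffs[1] = Fraction((-1) ** half * 2 * factorial(n), factorial(half))
        start = 1
    for p in range(start, n - 1, 2):
        # a_{p+2} = (2p - 2*alpha) / ((p + 2)(p + 1)) * a_p  with alpha = n
        coeffs[p + 2] = Fraction(2 * p - 2 * n, (p + 2) * (p + 1)) * coeffs[p]
    return [int(c) for c in coeffs]


if __name__ == "__main__":
    for n in range(6):
        print(n, hermite_coefficients(n))
```

For n = 4 the list reads 12, 0, -48, 0, 16, that is $H_4(x) = 16x^4 - 48x^2 + 12$, in agreement with (65).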
An equation very similar to that of Hermite is

$$y'' + (1 - x^2 + 2\alpha)y = 0 \tag{2-66}$$

For if we put $y = e^{-x^2/2}v$, so that $y'' = e^{-x^2/2}\{(x^2-1)v - 2xv' + v''\}$,
the equation turns into

$$v'' - 2xv' + 2\alpha v = 0$$

which is identical with (62). Hence the solution of (66) is simply any
solution of Hermite's equation, multiplied by $e^{-x^2/2}$.

2.16. Laguerre's Differential Equation.

$$xy'' + (1-x)y' + \alpha y = 0; \qquad \alpha = \text{constant} \tag{2-67}$$

has a non-essential singularity at the origin. On developing about $x = 0$, we find that
the indicial equation has the single root $\kappa = 0$. Only one solution will be
obtained, this being of considerable importance in physics. The recurrence
relation reads:

$$a_{j+1} = \frac{j-\alpha}{(j+1)^2}\,a_j$$

hence

$$y = a_0\left[1 - \alpha x + \frac{\alpha(\alpha-1)}{(2!)^2}x^2 - \cdots + (-1)^r\frac{\alpha(\alpha-1)\cdots(\alpha-r+1)}{(r!)^2}x^r + \cdots\right] \tag{2-68}$$


This expression becomes a polynomial when $\alpha = n$, a positive integer. On
putting

$$a_0 = n!$$

for integral n, y becomes the Laguerre polynomial of degree n:

$$L_n(x) = (-1)^n\left[x^n - \frac{n^2}{1!}x^{n-1} + \frac{n^2(n-1)^2}{2!}x^{n-2} - \cdots + (-1)^n n!\right] \tag{2-69}$$


A differential equation at once reducible to Laguerre's is

$$xy'' + (k+1-x)y' + (\alpha-k)y = 0, \qquad k \text{ an integer} > 0 \tag{2-70}$$

It results when (67) is differentiated k times and y is replaced by its k-th
derivative. Hence a solution of (70) for integral and positive $\alpha$ (equal to n) and k is

$$y = \frac{d^k}{dx^k}L_n(x) \equiv L_n^k(x)$$

This is sometimes called the associated Laguerre polynomial of degree
$n - k$.
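Both statements, the termination of (68) for integral $\alpha$ and the passage from (67) to (70) by k-fold differentiation, may be verified with exact integer arithmetic. In the sketch below the function names are ours, polynomials are stored as coefficient lists (lowest power first), and the normalization is the one displayed in (69), with constant term n!.

```python
from math import factorial


def laguerre_coefficients(n):
    """Coefficients c[j] of x**j in L_n(x), normalized so that c[0] = n!."""
    coeffs = [factorial(n)]
    for j in range(n):
        # recurrence a_{j+1} = (j - n) / (j + 1)**2 * a_j; the division is exact
        coeffs.append(coeffs[j] * (j - n) // (j + 1) ** 2)
    return coeffs


def derivative(coeffs, times=1):
    """Differentiate a polynomial given by its coefficient list."""
    for _ in range(times):
        coeffs = [p * c for p, c in enumerate(coeffs)][1:]
    return coeffs


def residual_2_70(n, k):
    """Coefficients of x*y'' + (k+1-x)*y' + (n-k)*y for y = d^k L_n / dx^k."""
    y = derivative(laguerre_coefficients(n), k)
    d1 = derivative(y)
    d2 = derivative(d1)
    out = [0] * (len(y) + 1)
    for p, c in enumerate(d2):
        out[p + 1] += c              # x * y''
    for p, c in enumerate(d1):
        out[p] += (k + 1) * c        # (k + 1) * y'
        out[p + 1] -= c              # -x * y'
    for p, c in enumerate(y):
        out[p] += (n - k) * c        # (n - k) * y
    return out


if __name__ == "__main__":
    print(laguerre_coefficients(3))   # constant term first
    print(residual_2_70(5, 2))        # every entry should vanish
```

A residual consisting entirely of zeros confirms that the k-th derivative of $L_n$ satisfies (70) identically.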

A third function closely related to the Laguerre polynomials satisfies
the differential equation

$$xy'' + 2y' + \left[n - \frac{k-1}{2} - \frac{x}{4} - \frac{k^2-1}{4x}\right]y = 0 \tag{2-71}$$

If we substitute in this equation $y = e^{-x/2}x^{(k-1)/2}v$, then v is seen to be a
solution of

$$xv'' + (k+1-x)v' + (n-k)v = 0$$

Comparison with (70) shows, therefore, that $v = L_n^k(x)$. Hence a particular
solution of (71) is

$$y = e^{-x/2}x^{(k-1)/2}L_n^k(x) \tag{2-72}$$

This function is known as an associated Laguerre function; it is of great
importance in the theory of the hydrogen atom. We observe that if n in
(71) were not an integer but any constant $\alpha$, the corresponding solution of
(71) would be

$$y = e^{-x/2}x^{(k-1)/2}\frac{d^k}{dx^k}L_\alpha(x)$$

where $L_\alpha$ is written for the series (68); provided, of course, that k is a
positive integer. This solution would no longer be a polynomial in x
multiplied by $e^{-x/2}x^{(k-1)/2}$ but an infinite series.

2.17. Mathieu's Equation. In the previous sections attention has
been given to differential equations in which $X_1$ and $X_2$¹⁵ were algebraic
functions of x. Equations sometimes arise in which these functions
are periodic. The simplest instance of these is Mathieu's equation, usually
written in the form

$$\frac{d^2y}{dx^2} + (a + 16b\cos 2x)\,y = 0 \tag{2-73}$$

where a and b are constants. Its general solution may be obtained by the
method of integration in series if the substitution

$$\xi = \cos^2 x$$

is made. (73) then reads

$$4\xi(1-\xi)\frac{d^2y}{d\xi^2} + 2(1-2\xi)\frac{dy}{d\xi} + (a - 16b + 32b\xi)\,y = 0 \tag{2-74}$$

¹⁵ Defined by eq. (27).
This equation has a non-essential singularity at $\xi = 0$ and can therefore be
developed as a power series about the origin. On inserting

$$y = \sum_\lambda a_\lambda\,\xi^{\kappa+\lambda}$$

in (74) we obtain

$$2\sum_\lambda(\kappa+\lambda)(2\kappa+2\lambda-1)\,a_\lambda\xi^{\kappa+\lambda-1} - \sum_\lambda\left[4(\kappa+\lambda)^2 - a + 16b\right]a_\lambda\xi^{\kappa+\lambda} + 32b\sum_\lambda a_\lambda\xi^{\kappa+\lambda+1} = 0$$

Here a feature arises which was not encountered before; the equation
contains three different summations instead of two and will therefore lead
to a three-term recurrence relation between the coefficients $a_\lambda$ instead of the
two-term relations that occurred in the former instances. This, however,
requires no modification of procedure, except that it will force us to advance
step by step in the computation of the coefficients. Only the first summation
can contribute to the coefficient of $\xi^{\kappa-1}$, which must be zero. Hence
the indicial equation is formed as before:

$$\kappa(2\kappa-1) = 0$$

whence we obtain the two choices: $\kappa = 0, \tfrac12$. Next, we equate to zero
the coefficients of $\xi^{\kappa}$, to which the first and second summations contribute.
This leads to

$$2(\kappa+1)(2\kappa+1)\,a_1 = (4\kappa^2 - a + 16b)\,a_0$$

so that

$$a_1 = \frac{4\kappa^2 - a + 16b}{2(\kappa+1)(2\kappa+1)}\,a_0$$

from which $a_1$ may be determined when the arbitrary constant $a_0$ is
assumed. On equating to zero the coefficient of $\xi^{\kappa+1}$, to which all three
summations contribute, one gets

$$2(\kappa+2)(2\kappa+3)\,a_2 - \left[4(\kappa+1)^2 - a + 16b\right]a_1 + 32b\,a_0 = 0$$

a relation permitting the calculation of $a_2$, etc. In this way two series can
be constructed, one for $\kappa = 0$, the other for $\kappa = \tfrac12$, linear composition of
which yields the general solution of (74) and hence of (73). Investigation
shows that this solution converges if $|\xi| < 1$.
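The step-by-step computation of the coefficients lends itself to a short program. The Python sketch below is offered as an illustration only; the function names and the choice $a_0 = 1$ are ours. It generates the coefficients from the three-term relation (the general form of the two relations written out above) and then inserts the truncated series into (74) to see how nearly the left-hand side vanishes.

```python
def mathieu_series(kappa, a, b, terms=80):
    """Coefficients a_0 ... a_{terms-1} of y = sum a_l * xi**(kappa + l), with a_0 = 1."""
    coeffs = [1.0]
    for lam in range(terms - 1):
        previous = coeffs[lam - 1] if lam >= 1 else 0.0
        numerator = (4 * (kappa + lam) ** 2 - a + 16 * b) * coeffs[lam] - 32 * b * previous
        coeffs.append(numerator / (2 * (kappa + lam + 1) * (2 * kappa + 2 * lam + 1)))
    return coeffs


def residual_2_74(kappa, a, b, xi, terms=80):
    """Left-hand side of (2-74) evaluated on the truncated series; assumes 0 < xi < 1."""
    coeffs = mathieu_series(kappa, a, b, terms)
    y = d1 = d2 = 0.0
    for lam, c in enumerate(coeffs):
        p = kappa + lam
        y += c * xi ** p
        d1 += c * p * xi ** (p - 1)
        d2 += c * p * (p - 1) * xi ** (p - 2)
    return 4 * xi * (1 - xi) * d2 + 2 * (1 - 2 * xi) * d1 + (a - 16 * b + 32 * b * xi) * y


if __name__ == "__main__":
    print(mathieu_series(0.0, 1.0, 0.1, terms=5))
    print(residual_2_74(0.0, 1.0, 0.1, 0.3), residual_2_74(0.5, 1.0, 0.1, 0.3))
```

As a check on the recurrence, the case b = 0, a = 4, $\kappa = 0$ terminates after two terms and gives $y = 1 - 2\xi = -\cos 2x$, which indeed satisfies $y'' + 4y = 0$.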

This general solution, however, is rarely of interest in physics and
chemistry, for it is not periodic in x. In most problems leading to
Mathieu's equation, x is an angle, so that there is no significant distinction
between x and $x + 2n\pi$, where n is an integer. Thus the solutions usually
sought must have the property that $y(x + 2n\pi) = y(x)$. The general
solution here found, which is of the form

$$y = A\sum_\lambda a_\lambda\,\xi^{\lambda} + B\,\xi^{1/2}\sum_\lambda b_\lambda\,\xi^{\lambda} \tag{2-75}$$

(the $a_\lambda$ and $b_\lambda$ being the coefficients belonging to $\kappa = 0$ and $\kappa = \tfrac12$, respectively)
does not possess this periodicity, as closer investigation would show.
Qualitatively this defect is apparent from the failure of the solution to
converge for $\xi = \pm 1$, which excludes the values $x = n\pi$ from consideration
altogether, as well as from the existence of a branch point of (75)
at $\xi = 0$ (arising from the factor $\xi^{1/2}$).

In fact it is impossible to obtain solutions of Mathieu's equation which
are periodic and of period $2\pi$ in x, unless definite restrictions are placed
upon the constant a. It turns out that the latter must be a complicated
function of b if the solution is to be periodic.¹⁶

Floquet's Theorem. An important theorem concerning the general
solution of Mathieu's equation, or indeed of any linear differential equation
with periodic coefficients which are one-valued functions of x, will now be
established. Suppose that $y_1(x)$ and $y_2(x)$ are two linearly independent
solutions of (73), so that any particular solution y may be compounded
from them by means of two constants $A_1$ and $A_2$ as follows:

$$y = A_1y_1 + A_2y_2 \tag{2-76}$$

Now it is clear that, if $y_1(x)$ and $y_2(x)$ are solutions of (73), $y_1(x + 2\pi)$
and $y_2(x + 2\pi)$ will also be solutions, for the substitution of $x + 2\pi$ in
place of x causes no change in the differential equation. This must, of
course, not be interpreted as implying that $y_1(x + 2\pi) = y_1(x)$ and
$y_2(x + 2\pi) = y_2(x)$; but it does mean that

$$y_1(x+2\pi) = \alpha_{11}y_1(x) + \alpha_{12}y_2(x); \qquad y_2(x+2\pi) = \alpha_{21}y_1(x) + \alpha_{22}y_2(x)$$

the $\alpha$'s being constants. Similarly, using (76)

$$y(x+2\pi) = A_1y_1(x+2\pi) + A_2y_2(x+2\pi) = (A_1\alpha_{11} + A_2\alpha_{21})\,y_1(x) + (A_1\alpha_{12} + A_2\alpha_{22})\,y_2(x)$$

We observe that the constants $\alpha$ are fixed by the choice of $y_1$ and $y_2$, but
$A_1$ and $A_2$ may be chosen at will and still leave y a particular solution of the
equation. It is possible to choose them so as to satisfy the equations

$$A_1\alpha_{11} + A_2\alpha_{21} = kA_1; \qquad A_1\alpha_{12} + A_2\alpha_{22} = kA_2 \tag{2-77}$$

where k is a constant not within our control, for if eqs. (77) are to be satisfied
then k must be subject to the equation

$$\begin{vmatrix}\alpha_{11}-k & \alpha_{21}\\ \alpha_{12} & \alpha_{22}-k\end{vmatrix} = 0 \tag{2-78}$$

But if (77) holds then

$$y(x+2\pi) = k\,[A_1y_1(x) + A_2y_2(x)] = k\,y(x) \tag{2-79}$$

¹⁶ Cf. Whittaker and Watson, "Modern Analysis," for further details regarding
periodic solutions.


In other words, there exists a particular solution y(x) such that, when x is
increased by $2\pi$, the solution itself is multiplied by the constant k. If k
were unity, this solution would be periodic.

This result may be expressed in a different way. On putting

$$k = e^{2\pi\mu}, \qquad y(x) = e^{\mu x}P(x)$$

eq. (79) reads

$$e^{\mu(x+2\pi)}P(x+2\pi) = e^{2\pi\mu}e^{\mu x}P(x)$$

so that P(x) turns out to be a periodic function. Thus it is seen that there
exists a particular solution of Mathieu's equation of the form

$$y = e^{\mu x}P(x) \tag{2-80}$$

where P is periodic. From here it is only a simple step to obtain a general
solution of (73). The differential equation is insensitive to the substitution
of $-x$ for x. Hence $e^{-\mu x}P(-x)$ must also be a solution. Moreover,
it is an independent solution, since it is not a constant multiple of (80).
The complete solution is, therefore, a linear combination of these two:

$$y = C_1e^{\mu x}P(x) + C_2e^{-\mu x}P(-x) \tag{2-81}$$

This result, known as Floquet's theorem, is of interest in some astronomical
applications and chiefly in the quantum theory of metals.¹⁷
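The constants $\alpha$ and the multiplier k can be found numerically. The sketch below is an illustration under our own assumptions, not a method given in the text: it takes for $y_1$ and $y_2$ the solutions with initial values $y_1(0) = 1$, $y_1'(0) = 0$ and $y_2(0) = 0$, $y_2'(0) = 1$, so that $\alpha_{11} = y_1(2\pi)$, $\alpha_{12} = y_1'(2\pi)$, $\alpha_{21} = y_2(2\pi)$, $\alpha_{22} = y_2'(2\pi)$, integrates (73) over one interval of length $2\pi$ by the classical fourth-order Runge-Kutta rule, and then solves (78).

```python
import cmath
import math


def monodromy(a, b, steps=4000):
    """Integrate y'' + (a + 16 b cos 2x) y = 0 from 0 to 2*pi with RK4.

    Returns ((alpha11, alpha12), (alpha21, alpha22)) for the solutions with
    y1(0) = 1, y1'(0) = 0 and y2(0) = 0, y2'(0) = 1.
    """
    def deriv(x, y, v):
        return v, -(a + 16.0 * b * math.cos(2.0 * x)) * y

    h = 2.0 * math.pi / steps
    rows = []
    for y, v in ((1.0, 0.0), (0.0, 1.0)):
        x = 0.0
        for _ in range(steps):
            k1y, k1v = deriv(x, y, v)
            k2y, k2v = deriv(x + h / 2, y + h / 2 * k1y, v + h / 2 * k1v)
            k3y, k3v = deriv(x + h / 2, y + h / 2 * k2y, v + h / 2 * k2v)
            k4y, k4v = deriv(x + h, y + h * k3y, v + h * k3v)
            y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
            v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
            x += h
        rows.append((y, v))
    return rows[0], rows[1]


def floquet_multipliers(a, b, steps=4000):
    """The two roots k of the determinantal equation (2-78)."""
    (a11, a12), (a21, a22) = monodromy(a, b, steps)
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


if __name__ == "__main__":
    k1, k2 = floquet_multipliers(1.0, 0.05)
    print(k1, k2, k1 * k2)
```

Since (73) contains no first derivative, the Wronskian of $y_1$ and $y_2$ is constant and the product of the two roots should come out equal to unity; this agrees with the appearance of $e^{\mu x}$ and $e^{-\mu x}$ in (81).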

Problem. Show that the Schrödinger equation

$$\frac{d^2\psi}{dx^2} + [A + F(x)]\,\psi = 0,$$

in which A is a constant, and F is a periodic function of x such that $F(x + l) = F(x)$,
has solutions of the form

$$\psi = e^{\mu x}v(x)$$

where v is also periodic: $v(x + l) = v(x)$.

This is sometimes called Bloch's theorem.¹⁸


¹⁷ See Seitz, F., "Modern Theory of Solids," McGraw-Hill Book Co., New York,
1940, Chap. VIII.

¹⁸ Bloch, F., Z. f. Phys. 52, 555 (1928).





2.18. Pfaff Differential Expressions and Equations. The equations of
thermodynamics are peculiar inasmuch as they usually occur in the form

$$dW = \sum_{\lambda=1}^{n}X_\lambda\,dx_\lambda \tag{2-82}$$

where the $X_\lambda$ are functions of some or all the independent variables $x_\lambda$.
While (82), which is known as a Pfaff expression, is not a differential equation
of the customary kind, its importance in chemistry and physics requires
consideration. It is for lack of a more adequate place that this material is
inserted in the chapter on differential equations. Some of the material
which will be developed from a mathematical point of view in this section
has already been used in Chapter 1, to which reference should be made for
further applications. The equation

$$\sum_{\lambda=1}^{n}X_\lambda\,dx_\lambda = 0$$

is sometimes called a total differential equation or, more generally, a Pfaff
equation.

Clearly, the expression dW, eq. (82), can be integrated along any path
in n-dimensional space, but the integral will in general depend on the path
of integration. (See Prob. a, p. 87; also the example in sec. 1.8.)
When $\int dW$ depends on the path of integration, dW is said to be incomplete
or inexact.

The condition that (82) be a complete differential is

$$dW = df(x_1x_2\cdots x_n)$$

for then $\int_{r_1}^{r_2}dW = f(r_2) - f(r_1)$, independently of path. Now

$$df = \sum_\lambda\frac{\partial f}{\partial x_\lambda}\,dx_\lambda \tag{2-83}$$

Comparing with (82), we find

$$X_\lambda = \frac{\partial f}{\partial x_\lambda}$$

To state this relation without explicitly introducing the function f, we
differentiate it with respect to $x_\mu$, $\mu \neq \lambda$:

$$\frac{\partial X_\lambda}{\partial x_\mu} = \frac{\partial^2 f}{\partial x_\lambda\,\partial x_\mu}$$

But also

$$\frac{\partial X_\mu}{\partial x_\lambda} = \frac{\partial^2 f}{\partial x_\mu\,\partial x_\lambda}$$

Hence the necessary condition of "exactness" may be written in the form

$$\frac{\partial X_\lambda}{\partial x_\mu} = \frac{\partial X_\mu}{\partial x_\lambda}; \qquad \lambda, \mu = 1\cdots n \tag{2-84}$$


The reader who is already familiar with vector analysis will note that, if
the $X_\lambda$ are interpreted as components of a vector R, (82) may be written

$$dW = \mathbf{R}\cdot d\mathbf{r} \tag{2-82'}$$

and the condition of "exactness" becomes

$$\frac{\partial X}{\partial y} = \frac{\partial Y}{\partial x}, \qquad \frac{\partial X}{\partial z} = \frac{\partial Z}{\partial x}, \qquad \frac{\partial Y}{\partial z} = \frac{\partial Z}{\partial y}$$

or

$$\nabla\times\mathbf{R} = 0 \tag{2-84'}$$

These results are of importance in vector analysis where they are usually
expressed as follows: The condition that the line integral of R (expression
82') around any closed curve shall vanish is that R be the gradient of some
scalar function, and this is equivalent to condition (84'). (Cf. sec. 4.17.)
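The dependence of $\int dW$ on the path, and its absence when (84) is satisfied, is readily exhibited numerically. The following Python sketch is illustrative only; the function names, the two sample expressions, and the two broken-line paths are our own choices. It integrates a Pfaff expression in two variables along straight segments by the midpoint rule.

```python
def line_integral(pfaff, points, steps=200):
    """Integrate X dx + Y dy along the polyline through `points` (midpoint rule)."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx = (x1 - x0) / steps
        dy = (y1 - y0) / steps
        for i in range(steps):
            t = (i + 0.5) / steps
            big_x, big_y = pfaff(x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            total += big_x * dx + big_y * dy
    return total


def inexact(x, y):
    """dW = x dx + x dy: here dX/dy = 0 while dY/dx = 1, so (2-84) fails."""
    return x, x


def exact(x, y):
    """dW = y dx + x dy = d(xy): here dX/dy = dY/dx = 1."""
    return y, x


# Two paths from (1, 2) to (4, 7): x varied first, or y varied first.
corner_first = [(1.0, 2.0), (4.0, 2.0), (4.0, 7.0)]
corner_second = [(1.0, 2.0), (1.0, 7.0), (4.0, 7.0)]

if __name__ == "__main__":
    for form in (inexact, exact):
        print(form.__name__, line_integral(form, corner_first), line_integral(form, corner_second))
```

For the exact expression both paths give $f(r_2) - f(r_1) = 4\cdot 7 - 1\cdot 2 = 26$; for the inexact one the two results differ by 15, the area of the rectangle enclosed by the two paths.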
We return now to the general situation:

dW is not exact

and distinguish two cases:

A. The equation dW = 0 has a solution.

B. The equation dW = 0 does not have a solution.

A. The equation dW = 0 possesses a solution. Leaving aside for the
moment all considerations as to when such solutions may be found, we shall
first sketch the consequences of the existence of solutions. The equation
dW = 0 assigns to every point a direction, or, what amounts to the same
thing, an element of surface. (From the point of view of vector analysis
this is immediately clear because the relation $\mathbf{R}\cdot d\mathbf{r} = 0$ specifies at every point
$(x_1\cdots x_n)$ the direction dr which is perpendicular to the vector R.)

When integrated, the equation dW = 0 leads to

$$\phi(x_1x_2\cdots x_n) = c \tag{2-85}$$

which represents a one-parameter family of surfaces in n-dimensional
space. These surfaces consist of the elements specified by dW = 0.

We now wish to show that there exists an integrating denominator,
$\tau(x_1\cdots x_n)$, such that $dW/\tau$ is an exact differential. The proof is as follows.
Along the surface $\phi(x_1\cdots x_n) = c$ (cf. Fig. 2-6), we have both $d\phi = 0$
and $dW = 0$. The same is true along a neighboring surface $\phi = c + dc$.
Suppose we wish to go from A to C. The change occurring in $\phi$ is dc,
no matter whether the crossing is made at $B_1$ or at $B_2$. But the change dW
will depend on the path. The important point to note is that no change
occurs in W as we pass along either curve; a change can occur only at the
crossing: dW = function of the point at which the crossing is made. (If
$dW \neq 0$ along the two curves, then it would depend on the whole path, not
merely on the point of crossing!)
Hence $dW = \tau(B)\,d\phi$, where B is
the point of crossing. Hence $dW = \tau(x_1\cdots x_n)\,d\phi$, or

$$d\phi = \frac{dW}{\tau}$$

But $d\phi$ is an exact differential.

Along the curves $\phi$ = const., the equation $F(\phi)$ = const. will likewise
be satisfied if F represents a unique, single-valued function. If, then, we
use $F(\phi)$ in place of $\phi$ in the preceding analysis, we are led to

$$dF = \frac{dW}{T} \quad\text{instead of}\quad d\phi = \frac{dW}{\tau}$$

Since, however, $dF = (dF/d\phi)\,d\phi$, we see that $T = \tau/(dF/d\phi)$ is also an
integrating denominator. It is clear that, if there exists one integrating
denominator $\tau$ for a Pfaff expression, an infinite number of others can be
formed by the above rule.

Only the points on the surface $\phi = c$ are connected with A by paths
along which dW = 0. It is clear that in the neighborhood of A there is an
infinite number of points not connected with A by such paths. Hence the
fact, important in thermodynamics (though somewhat trivial geometrically!):

If the inexact differential dW possesses an integrating denominator $\tau$, then
there exist in the neighborhood of every point P innumerable points which
cannot be reached from P along paths for which dW = 0.

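We now consider the question of how to find the integrating denominator.

1. Case of two variables. First solve the equation

$$dW = 0; \qquad X\,dx + Y\,dy = 0 \tag{2-86}$$

The solution is

$$y = f(x,c), \quad\text{or}\quad \phi(x,y) = c \tag{2-87}$$

Along the curves (87), $\phi_x\,dx + \phi_y\,dy = 0$, hence

$$\frac{dy}{dx} = -\frac{\phi_x}{\phi_y} \tag{2-88}$$

But from (86)

$$\frac{dy}{dx} = -\frac{X}{Y}$$

so that

$$\frac{\phi_x}{\phi_y} = \frac{X}{Y}; \qquad \frac{\phi_x}{X} = \frac{\phi_y}{Y} = \mu(x,y) \tag{2-89}$$

Now

$$d\phi = \phi_x\,dx + \phi_y\,dy = \mu X\,dx + \mu Y\,dy = \mu\,dW$$

Hence

$$\tau = \frac{1}{\mu} = \frac{X}{\phi_x} \tag{2-90}$$

by (89).

2. Case of three variables. First solve

$$dW = 0; \qquad X\,dx + Y\,dy + Z\,dz = 0 \tag{2-91}$$

The solution is

$$\phi(x,y,z) = c$$

Along these surfaces, $\phi_x\,dx + \phi_y\,dy + \phi_z\,dz = 0$, hence

$$\left(\frac{\partial z}{\partial x}\right)_y = -\frac{\phi_x}{\phi_z}, \qquad \left(\frac{\partial z}{\partial y}\right)_x = -\frac{\phi_y}{\phi_z}$$

But from (91)

$$\left(\frac{\partial z}{\partial x}\right)_y = -\frac{X}{Z}, \qquad \left(\frac{\partial z}{\partial y}\right)_x = -\frac{Y}{Z}$$

Hence

$$\frac{\phi_x}{\phi_z} = \frac{X}{Z}, \qquad \frac{\phi_y}{\phi_z} = \frac{Y}{Z}$$

or

$$\frac{\phi_x}{X} = \frac{\phi_y}{Y} = \frac{\phi_z}{Z} = \mu(x,y,z)$$

Now

$$d\phi = \phi_x\,dx + \phi_y\,dy + \phi_z\,dz = \mu(X\,dx + Y\,dy + Z\,dz) = \mu\,dW$$

Therefore

$$\tau = \frac{1}{\mu} = \frac{X}{\phi_x} = \frac{Y}{\phi_y} = \frac{Z}{\phi_z}$$

Similarly for more than three variables.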
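We now consider the condition that the equation

$$dW = 0$$

shall have a solution. (Condition of integrability.)

Suppose a solution of $\sum_\lambda X_\lambda\,dx_\lambda = 0$ exists in the form

$$\phi(x_1\cdots x_n) = c$$

Then

$$\mu(x_1\cdots x_n)\,X_i = \frac{\partial\phi}{\partial x_i}, \qquad i = 1, 2, \cdots n \tag{2-92}$$

Let i, j, k be different indices. It follows from (92) that

$$\frac{\partial^2\phi}{\partial x_i\,\partial x_j} = \frac{\partial}{\partial x_i}(\mu X_j) = \frac{\partial}{\partial x_j}(\mu X_i)$$

whence

$$\mu\left(\frac{\partial X_j}{\partial x_i} - \frac{\partial X_i}{\partial x_j}\right) = X_i\frac{\partial\mu}{\partial x_j} - X_j\frac{\partial\mu}{\partial x_i}$$

Similarly,

$$\mu\left(\frac{\partial X_k}{\partial x_j} - \frac{\partial X_j}{\partial x_k}\right) = X_j\frac{\partial\mu}{\partial x_k} - X_k\frac{\partial\mu}{\partial x_j}$$

$$\mu\left(\frac{\partial X_i}{\partial x_k} - \frac{\partial X_k}{\partial x_i}\right) = X_k\frac{\partial\mu}{\partial x_i} - X_i\frac{\partial\mu}{\partial x_k}$$

Multiply the last three equations by $X_k$, $X_i$, and $X_j$, respectively, and add:

$$X_i\left(\frac{\partial X_k}{\partial x_j} - \frac{\partial X_j}{\partial x_k}\right) + X_j\left(\frac{\partial X_i}{\partial x_k} - \frac{\partial X_k}{\partial x_i}\right) + X_k\left(\frac{\partial X_j}{\partial x_i} - \frac{\partial X_i}{\partial x_j}\right) = 0 \tag{2-93}$$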


By closer analysis, this equation may be shown to be both necessary and
sufficient; it represents the condition of integrability for the Pfaff equation
dW = 0. In three variables, eq. (93) takes the form

$$\mathbf{R}\cdot\nabla\times\mathbf{R} = 0$$

provided R is interpreted as the vector having components $X_1, X_2, X_3$.
The total number of equations of the form (93) is equal to the number of
triangles that can be formed with n given points as corners; it is therefore
$\tfrac16 n(n-1)(n-2)$. These equations are not all independent.

It is to be observed that, in the case of two variables, eq. (93) is always
satisfied. Hence every Pfaff equation of the form

$$X\,dx + Y\,dy = 0$$

possesses a solution.
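In three variables the condition of integrability can be tested at any point by forming $\mathbf{R}\cdot\nabla\times\mathbf{R}$ numerically. The sketch below is a hedged illustration: the function names and the three sample expressions are ours, and the partial derivatives are estimated by central differences, which is adequate for the smooth fields chosen here.

```python
def partial(field, component, axis, point, h=1e-4):
    """Central-difference estimate of d(field[component]) / d(x[axis]) at `point`."""
    forward = list(point)
    backward = list(point)
    forward[axis] += h
    backward[axis] -= h
    return (field(*forward)[component] - field(*backward)[component]) / (2 * h)


def integrability_defect(field, point):
    """Estimate R . (curl R) at `point` for R = (X, Y, Z); zero means (2-93) holds."""
    big_x, big_y, big_z = field(*point)
    curl_x = partial(field, 2, 1, point) - partial(field, 1, 2, point)
    curl_y = partial(field, 0, 2, point) - partial(field, 2, 0, point)
    curl_z = partial(field, 1, 0, point) - partial(field, 0, 1, point)
    return big_x * curl_x + big_y * curl_y + big_z * curl_z


def exact_field(x, y, z):
    return y * z, x * z, x * y        # dW = d(xyz)


def integrable_field(x, y, z):
    return y, -x, 0.0                 # inexact, but dW / x**2 = -d(y/x)


def non_integrable_field(x, y, z):
    return -y, x, 3.0                 # -y dx + x dy + k dz with k = 3


if __name__ == "__main__":
    point = (0.4, -1.2, 2.0)
    for field in (exact_field, integrable_field, non_integrable_field):
        print(field.__name__, integrability_defect(field, point))
```

The first two expressions give zero; the third gives 2k, which differs from zero for every k other than zero, so that no integrating denominator exists for it.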
B. The equation dW = 0 does not possess a proper solution, i.e., eq.
(93) is not satisfied. For simplicity, we consider only the case of three
variables, where the solutions can be visualized easily in ordinary space.
Generalization to more variables introduces no complications. It will be
seen that "improper" solutions of eq. (82) are still possible, but that they
represent a greater variety of functions than the proper solutions considered
in the preceding paragraphs.

We now choose an arbitrary relation

$$\psi(x,y,z) = 0 \tag{2-94}$$

and impose this upon eq. (82), thereby effectively eliminating one degree of
freedom. From (94) and its differential form

$$\psi_x\,dx + \psi_y\,dy + \psi_z\,dz = 0$$

the variables z and dz are obtained in terms of x, y, dx, dy, and these solutions
are substituted in eq. (82). It will then be of the form

$$X\,dx + Y\,dy = 0$$

and this has a solution

$$\phi(x,y) = 0 \tag{2-95}$$

The improper solutions of (82) are said to be those curves which satisfy (94)
and (95) simultaneously. They represent, therefore, prescribed curves
upon arbitrary surfaces. Further investigation would show that every
point in the neighborhood of a given point can be reached by a continuous
curve satisfying (94) and (95) from the given point, the state of affairs being
quite different from that described under A.


Problem a. Let $dW = x(dx + dy)$. Compute the integral $\int dW$ along two
paths:

1. $x_1y_1 \to x_2y_1 \to x_2y_2$.

2. $x_1y_1 \to x_1y_2 \to x_2y_2$.

Show that the two results differ by the area enclosed by the two paths of integration.

Problem b. Show that the expression

$$dW = -y\,dx + x\,dy + k\,dz = 0$$

where k is a constant, does not possess an integral.

See Born, M., Physik. Z. 22, 250 (1921).



CHAPTER 3 
SPECIAL FUNCTIONS 


3.1. Elements of Complex Integration. In the present chapter the
more common functions appearing in physical and chemical theory will be
listed and their chief properties will be described. It will be assumed
that the reader is familiar with the simpler notions of the calculus of complex
variables, in particular with the meaning of the Argand diagram or
complex plane. As to notation, the symbol x will be used for a single real
variable, while z denotes $x + iy$. We shall also assume without
proof the famous theorem of Cauchy which states that, if f(z) is an analytic
function of z in a certain region including the point z = a, and if $\oint$ denotes
the line integral along a closed contour within this domain taken around the
point a in a counter-clockwise sense, then

$$\oint f(z)\,dz = 0 \tag{3-1}$$

and

$$\oint\frac{f(z)}{z-a}\,dz = 2\pi i\,f(a) \tag{3-2}$$


From these two equations it is possible to derive the theorem of residues,
which will now be stated. Suppose that the function f(z) can be expanded
in the neighborhood of the point $z = z_0$ in the form

$$f(z) = \frac{a_{-m}}{(z-z_0)^m} + \frac{a_{-m+1}}{(z-z_0)^{m-1}} + \cdots + \frac{a_{-1}}{z-z_0} + a_0 + a_1(z-z_0) + \cdots$$

where m is some finite integer.¹ Then

$$\oint f(z)\,dz = 2\pi i\,a_{-1} \tag{3-3}$$

provided the integral is taken counter-clockwise about the point $z_0$. The
coefficient $a_{-1}$ is said to be the residue of the function f(z). As a generalization
of (3) we note that, if the contour of integration includes other poles
at which the function has residues $b_{-1}, c_{-1}, \cdots$,

$$\oint f(z)\,dz = 2\pi i\,(a_{-1} + b_{-1} + c_{-1} + \cdots) \tag{3-3a}$$

These theorems are proved in books on complex variables.

¹ When this expansion is possible, f(z) is said to have a pole of order m at $z_0$.

Example. To evaluate the integral

$$I = \int_0^{2\pi}\frac{d\phi}{a + b\cos\phi + c\sin\phi}$$


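Let $z = e^{i\phi}$; $\phi = -i\log z$, $d\phi = -i(dz/z)$. Then $\cos\phi = \tfrac12(z + z^{-1})$, $\sin\phi = (1/2i)(z - z^{-1})$, and

$$I = -i\oint\frac{dz}{az + \dfrac{b}{2}(z^2+1) + \dfrac{c}{2i}(z^2-1)}$$

the contour being the unit circle about 0. The denominator of the integrand may be
written

$$\tfrac12(b-ic)z^2 + az + \tfrac12(b+ic) = \tfrac12(b-ic)\left[z - \frac{-a+R}{b-ic}\right]\left[z - \frac{-a-R}{b-ic}\right]$$

provided we put

$$a^2 - b^2 - c^2 \equiv R^2$$

If $a^2 - b^2 - c^2 > 0$ (and $a > 0$) then

$$\left|\frac{-a+R}{b-ic}\right| < 1$$

The other root has modulus greater than 1 and lies outside the unit circle. The residue of the integrand at
$z = (-a+R)/(b-ic)$ is

$$\frac{1}{\tfrac12(b-ic)\left[\dfrac{-a+R}{b-ic} - \dfrac{-a-R}{b-ic}\right]} = \frac{1}{R}$$

Therefore

$$I = -i\cdot 2\pi i\,(a^2-b^2-c^2)^{-1/2} = \frac{2\pi}{\sqrt{a^2-b^2-c^2}}$$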
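The result of this example is easily confirmed by direct numerical quadrature. The following Python sketch is illustrative; the function names are ours, the constants are assumed to satisfy $a > 0$ and $a^2 > b^2 + c^2$ with b and c not both zero, and the integral is approximated by a simple rectangle sum over one full period.

```python
import math


def integral_numeric(a, b, c, samples=2000):
    """Rectangle-rule estimate of the integral of 1/(a + b cos + c sin) over 0..2*pi."""
    h = 2.0 * math.pi / samples
    return h * sum(
        1.0 / (a + b * math.cos(i * h) + c * math.sin(i * h)) for i in range(samples)
    )


def integral_by_residue(a, b, c):
    """Closed form 2*pi / sqrt(a^2 - b^2 - c^2); assumes a > 0 and a^2 > b^2 + c^2."""
    return 2.0 * math.pi / math.sqrt(a * a - b * b - c * c)


def pole_inside(a, b, c):
    """The root (-a + R)/(b - ic) of the denominator; assumes b, c not both zero."""
    big_r = math.sqrt(a * a - b * b - c * c)
    return (-a + big_r) / complex(b, -c)


if __name__ == "__main__":
    a, b, c = 3.0, 1.0, 2.0
    print(integral_numeric(a, b, c), integral_by_residue(a, b, c))
    print(abs(pole_inside(a, b, c)))   # modulus of the pole enclosed by the contour
```

With a = 3, b = 1, c = 2 one has R = 2, so that both evaluations should give $\pi$, and the enclosed pole has modulus $1/\sqrt5$.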

3.2. Gamma Function. The gamma function is a generalization of
the factorial n! for non-integral values of n; more specifically, $\Gamma(z)$ is so
chosen that, if n is a positive integer, $\Gamma(n) = (n-1)!$. A fundamental definition,
due to Euler, states

$$\Gamma(z) = \lim_{n\to\infty}\frac{1\cdot 2\cdot 3\cdots(n-1)}{z(z+1)\cdots(z+n-1)}\,n^z \tag{3-4}$$

Several important properties of the $\Gamma$-function follow at once from this
definition. Since from (4)

$$\Gamma(z+1) = \lim_{n\to\infty}\frac{1\cdot 2\cdots(n-1)}{(z+1)(z+2)\cdots(z+n)}\,n^{z+1}$$

we have

$$\Gamma(z+1) = \lim_{n\to\infty}\frac{zn}{z+n}\cdot\frac{1\cdot 2\cdots(n-1)}{z(z+1)\cdots(z+n-1)}\,n^z = z\,\Gamma(z) \tag{3-5}$$

On the other hand, (4) also shows that

$$\Gamma(1) = \lim_{n\to\infty}\frac{(n-1)!}{n!}\,n = 1 \tag{3-6}$$

From (5) and (6) it is at once apparent that, if n is a positive integer,

$$\Gamma(n) = (n-1)! \tag{3-7}$$

as was stated above. It is also evident from the definition (4) that $\Gamma(z)$
becomes infinite at z = 0, -1, -2, etc., and that it is an analytic function
everywhere else.
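Euler's definition converges slowly, as a direct evaluation shows. The sketch below is illustrative only: the function name is ours, z is assumed real and positive, and the product in (4) is evaluated through a sum of logarithms of the ratios m/(z + m) to avoid overflow. The library function math.gamma serves as the reference value.

```python
import math


def gamma_euler(z, n=200000):
    """The expression under the limit sign in (3-4), for real z > 0 and finite n."""
    log_value = z * math.log(n) - math.log(z)
    # product over m = 1 ... n-1 of m / (z + m), accumulated as a sum of logarithms
    log_value += math.fsum(math.log(m / (z + m)) for m in range(1, n))
    return math.exp(log_value)


if __name__ == "__main__":
    for z in (0.5, 1.0, 3.5):
        print(z, gamma_euler(z), math.gamma(z))
    # the functional equation (3-5): Gamma(z + 1) = z * Gamma(z)
    print(gamma_euler(1.5) / gamma_euler(0.5))
```

The relative error decreases only in proportion to 1/n, which is why the integral and product representations derived next are preferred in practice.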

It is often useful to represent $\Gamma(z)$ by means of a definite integral. To
achieve this, we consider the function

$$F(z,n) = \int_0^n\left(1-\frac{t}{n}\right)^n t^{z-1}\,dt \tag{3-8}$$

wherein n stands for a positive integer, and the real part of z is taken to be
greater than zero in order to insure convergence of the integral. The
transformation $\tau = t/n$ converts F into

$$F(z,n) = n^z\int_0^1(1-\tau)^n\tau^{z-1}\,d\tau$$

The integral appearing here may be evaluated by repeated partial integrations:

$$\int_0^1(1-\tau)^n\tau^{z-1}\,d\tau = \left[(1-\tau)^n\frac{\tau^z}{z}\right]_0^1 + \frac{n}{z}\int_0^1(1-\tau)^{n-1}\tau^z\,d\tau$$

The integrated part here vanishes at both limits, and the remainder may
again be subjected to a partial integration, yielding

$$\frac{n}{z}\left\{\left[(1-\tau)^{n-1}\frac{\tau^{z+1}}{z+1}\right]_0^1 + \frac{n-1}{z+1}\int_0^1(1-\tau)^{n-2}\tau^{z+1}\,d\tau\right\}$$

The integrated part is again zero. By continuing this process we find

$$F(z,n) = n^z\,\frac{n(n-1)\cdots 1}{z(z+1)\cdots(z+n-1)}\int_0^1\tau^{z+n-1}\,d\tau = \frac{1\cdot 2\cdots n}{z(z+1)\cdots(z+n)}\,n^z$$

As n approaches infinity, this expression becomes identical with (4); hence

$$\lim_{n\to\infty}F(z,n) = \Gamma(z) \tag{3-9}$$

On the other hand, since $e = \lim_{p\to\infty}(1+1/p)^p$ and therefore

$$e^x = \lim_{p\to\infty}(1+1/p)^{px} = \lim_{n\to\infty}(1+x/n)^n$$

the quantity $(1-t/n)^n$ appearing in (8) approaches the limit $e^{-t}$. We
conclude, therefore, that in view of (8) and (9)

$$\int_0^\infty e^{-t}t^{z-1}\,dt = \Gamma(z) \tag{3-10}$$

This result is valid, we recall, when the real part of z is greater than zero.

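A definition of the $\Gamma$-function, or rather its reciprocal, by means of an
infinite product has been given by Weierstrass. Since it is a useful one, we
shall here derive it by simple steps (the rigor of which is not always obvious)
from Euler's definition (4). We first note that the product

$$\frac1z\cdot\frac{1}{z+1}\cdot\frac{2}{z+2}\cdots\frac{n-1}{z+n-1}$$

which appears in (4), may be written $\dfrac1z\prod_{m=1}^{n-1}(1+z/m)^{-1}$ so that (4) becomes

$$\Gamma(z) = \frac1z\lim_{n\to\infty}n^z\prod_{m=1}^{n-1}\left(1+\frac zm\right)^{-1}$$

or

$$\frac{1}{\Gamma(z)} = z\lim_{n\to\infty}n^{-z}\prod_{m=1}^{n-1}\left(1+\frac zm\right)$$

If we multiply the right-hand side of this equation by unity in the form of

$$\lim_{n\to\infty}e^{\left(1+\frac12+\cdots+\frac{1}{n-1}\right)z}\prod_{m=1}^{n-1}e^{-z/m}$$

we obtain

$$\frac{1}{\Gamma(z)} = z\left[\lim_{n\to\infty}e^{\left(1+\frac12+\cdots+\frac{1}{n-1}-\log n\right)z}\right]\left[\lim_{n\to\infty}\prod_{m=1}^{n-1}\left(1+\frac zm\right)e^{-z/m}\right]$$

Now the infinite series: $\lim_{n\to\infty}(1+\tfrac12+\cdots+1/n-\log n) = C$ converges;
it has the value C = 0.5772···, known as the Euler-Mascheroni constant.
Hence

$$\frac{1}{\Gamma(z)} = z\,e^{Cz}\prod_{m=1}^{\infty}\left(1+\frac zm\right)e^{-z/m} \tag{3-11}$$

which is the Weierstrass definition. It shows, again, that $\Gamma(z)$ has poles
at z = 0, -1, -2, etc.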
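A further important property of $\Gamma$-functions, namely the relation

$$\Gamma(z)\,\Gamma(1-z) = \frac{\pi}{\sin\pi z} \tag{3-12}$$

is readily derived from the Weierstrass definition. First, we recall the
theorem:

$$\sin\pi z = \pi z\prod_{m=1}^{\infty}\left(1-\frac{z^2}{m^2}\right) \tag{3-13}$$

which may be proved by an expansion of the infinite product as a sum of
powers of $z^2$. (The details are left as an exercise for the reader.) From
(11),

$$\Gamma(z)\,\Gamma(-z) = -\frac{1}{z^2\prod_{m=1}^{\infty}\left(1-\dfrac{z^2}{m^2}\right)} = -\frac{\pi}{z\sin\pi z} \tag{3-14}$$

the last step because of (13). But in view of (5)

$$\Gamma(-z) = -\frac1z\,\Gamma(1-z)$$

and this, when inserted in (14), yields (12).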

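Several other formulas, for the derivation of which the reader should
refer to mathematical treatises,² will now be listed without proof.

$$\Gamma(z)\,\Gamma(z+\tfrac12) = 2^{1-2z}\,\pi^{1/2}\,\Gamma(2z) \tag{3-15}$$

An infinite product of the form

$$\frac{1-a}{1-b}\cdot\frac{2-a}{2-b}\cdot\frac{3-a}{3-b}\cdots$$

may be expressed in terms of $\Gamma$-functions:

$$\prod_{n=1}^{\infty}\frac{n-a}{n-b} = \frac{\Gamma(1-b)}{\Gamma(1-a)} \tag{3-16}$$

(This relation is to be understood formally: the product converges only when several such factors are combined in such a way that the sum of the a's equals the sum of the b's, as in the following instance.)
Also,

$$\prod_{n=1}^{\infty}\frac{n(a+b+n)}{(a+n)(b+n)} = \frac{\Gamma(a+1)\,\Gamma(b+1)}{\Gamma(a+b+1)} \tag{3-16a}$$

² Cf., for instance, Whittaker and Watson, p. 235.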

If m and n are positive constants, not necessarily integral, we have

$$\int_0^{\pi/2}\cos^{m-1}x\,\sin^{n-1}x\,dx = \frac{\Gamma(m/2)\,\Gamma(n/2)}{2\,\Gamma\!\left(\frac{m+n}{2}\right)} \tag{3-17}$$

This relation may be modified as follows. Put m = 2r, n = 2s, and
introduce the new variable of integration $\cos^2 x = u$ on the left. The
integral will then be converted into

$$\frac12\int_0^1u^{r-1}(1-u)^{s-1}\,du$$

in which the integral is a function of r and s known as the Eulerian integral of the first kind,
or simply the B-function, and denoted by B(r,s). Eq. (17) may therefore
be put in the form

$$B(r,s) = \frac{\Gamma(r)\,\Gamma(s)}{\Gamma(r+s)} \tag{3-17'}$$

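The logarithmic derivative of the $\Gamma$-function is given by

$$\frac{d}{dz}\ln\Gamma(z) = \int_0^\infty\left[\frac{e^{-t}}{t} - \frac{e^{-zt}}{1-e^{-t}}\right]dt \tag{3-18}$$

if x = real part of z > 0, as was shown by Gauss.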

From this result it is possible to obtain an expression for $\ln\Gamma(z)$ which
is useful in evaluating $\Gamma(z)$ for large values of z:

$$\ln\Gamma(z) = (z-\tfrac12)\ln z - z + \tfrac12\ln(2\pi) + O(1/x) \tag{3-19}$$

where O(1/x) represents a series of terms which vanish for large z at least as
strongly as 1/x. For real z, (19) takes the form of Stirling's series, when
written for $\Gamma$ instead of its logarithm:

$$\Gamma(x) = e^{-x}x^{x-1/2}(2\pi)^{1/2}\left[1 + \frac{1}{12x} + \frac{1}{288x^2} - \frac{139}{51840x^3} - \frac{571}{2488320x^4} + \cdots\right] \tag{3-20}$$

It is valid when x is large. This expansion may be used for the approximate
evaluation of factorials of large numbers:

$$N! = N\,\Gamma(N) = e^{-N}N^N(2\pi N)^{1/2}\left[1 + \frac{1}{12N} + \cdots\right] \tag{3-21}$$
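The quality of Stirling's series is best judged from a numerical comparison. The sketch below is illustrative; the names are ours, the coefficients are those displayed in (20), and math.gamma and math.factorial supply the reference values.

```python
import math

# coefficients of 1, 1/x, 1/x^2, 1/x^3, 1/x^4 in the bracket of (3-20)
STIRLING_COEFFICIENTS = (1.0, 1.0 / 12.0, 1.0 / 288.0, -139.0 / 51840.0, -571.0 / 2488320.0)


def gamma_stirling(x, terms=5):
    """Series (3-20) with the bracket truncated after `terms` terms; x real and large."""
    bracket = sum(c / x ** k for k, c in enumerate(STIRLING_COEFFICIENTS[:terms]))
    return math.exp(-x) * x ** (x - 0.5) * math.sqrt(2.0 * math.pi) * bracket


def factorial_stirling(n, terms=5):
    """N! = N * Gamma(N), as in (3-21)."""
    return n * gamma_stirling(float(n), terms)


if __name__ == "__main__":
    for terms in (1, 2, 5):
        approx = gamma_stirling(10.0, terms)
        print(terms, approx, abs(approx / math.gamma(10.0) - 1.0))
    print(factorial_stirling(20), math.factorial(20))
```

At x = 10 the leading term alone is in error by somewhat less than one per cent, while the five terms shown in (20) reduce the relative error to the order of $10^{-8}$.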
In concluding, let us compute a few numerical values of the $\Gamma$-function.
It has already been noted that

$$\Gamma(0) = \infty,\quad \Gamma(1) = 1,\quad \Gamma(2) = 1,\quad \Gamma(3) = 2!,\quad \Gamma(4) = 3!,\ \text{etc.}$$

If the values of $\Gamma(x)$ in the interval 0 < x < 1 are known, $\Gamma(x)$ can be
computed for all real positive x by means of (5). $\Gamma(\tfrac12)$ is easily obtained
from eq. (10):

$$\Gamma(\tfrac12) = \int_0^\infty e^{-t}t^{-1/2}\,dt = 2\int_0^\infty e^{-x^2}\,dx$$

if $x^2$ is written for t. Hence $\Gamma(\tfrac12) = \sqrt{\pi}$. The same result could have
been obtained by putting $z = \tfrac12$ in (12). Tabulated values of the $\Gamma$-function
for real arguments may be found in Jahnke and Emde.³ The qualitative
behavior of $\Gamma(x)$ is plotted in Fig. 1.


Problems.

a. Prove eq. (13) by expanding the infinite product.

b. Prove eq. (17') directly. Hint: Express $\Gamma(r)\Gamma(s)$ as a double integral in
accordance with eq. (10). Next, put the two variables of integration, respectively, equal
to $x^2$ and $y^2$ and then transform to polar coordinates. The radial integral will be
$\Gamma(r+s)$, the remainder B(r,s).

c. Show that B(r,s) = B(r + 1, s) + B(r, s + 1).


3.3. Legendre Polynomials. Of the solutions of Legendre's equation
(2-28), the functions denoted by $P_l(x)$ in sec. 2.11 are of greatest interest
because they remain finite at $x = \pm 1$. In physical problems, the argument
of $P_l$ is usually the cosine of an angle and has therefore the range
$-1 \le x \le 1$. $P_l$ is finite at the endpoints of that range; the other
solutions are not. Hence the present discussion will be restricted to the
polynomials $P_l$. We repeat their definition:

$$P_l(x) = \frac{(2l)!}{2^l(l!)^2}\left[x^l - \frac{l(l-1)}{2(2l-1)}x^{l-2} + \frac{l(l-1)(l-2)(l-3)}{2\cdot 4(2l-1)(2l-3)}x^{l-4} - \cdots\right] \tag{3-22}$$

An interesting representation of $P_l$ is easily established. When the
function

$$F(x,y) = (1-2xy+y^2)^{-1/2} \tag{3-23}$$

is differentiated l times with respect to y and y is then put equal to
zero, the result is seen to be $l!\,P_l(x)$. Hence if F(x,y) is expanded in a
power series in y, it becomes

$$F(x,y) = (1-2xy+y^2)^{-1/2} = \sum_{l=0}^{\infty}P_l(x)\,y^l \tag{3-24}$$

This relation has meaning only when the right-hand side converges.
Suppose that $|x| \le 1$ which, as pointed out, is the case in most applications.
$P_l(x)$ will then also lie between 1 and -1, for the definition of $P_l$
shows that every

$$P_l(1) = 1$$

and that $|P_l(x)| < 1$ for $|x| < 1$. Thus the coefficients of $y^l$ in (24)
are never greater, in absolute value, than 1, and the series converges when
$|y| < 1$.

³ Jahnke, E., and Emde, F., "Funktionentafeln," Teubner, Leipzig and Berlin,
1909.

Fig. 3-2


Theorem (24) is of interest in the calculation of the potential due to two
equal and opposite electrical point charges. (Cf. Fig. 2.) The potential
at P is

$$V = \frac{q}{\rho_1} - \frac{q}{\rho_2}$$

Now

$$\rho_{1,2} = (R^2 + d^2 \mp 2Rd\cos\theta)^{1/2}$$

On identifying $\pm\cos\theta$ with x and d/R with y in (24) one obtains

$$V = \frac{q}{R}\sum_{l=0}^{\infty}\left[P_l(\cos\theta) - P_l(-\cos\theta)\right]\left(\frac{d}{R}\right)^l \tag{3-25}$$

Since $P_l(x)$ is an even function of x if l is even, an odd function if l is odd,
the even powers of d/R will cancel from (25), the remaining terms being

$$V = \frac{2q}{R}\sum_{l=0}^{\infty}P_{2l+1}(\cos\theta)\left(\frac{d}{R}\right)^{2l+1} \tag{3-26}$$

This expansion is valid if d/R < 1, that is, if the distance of the point P
from the center of charge is greater than half the separation of the charges.
If $R \gg d$, only the first term in (26) is of importance, so that V reduces to
the familiar dipole potential

$$V = \frac{2qd}{R^2}P_1(\cos\theta) = \frac{\mu\cos\theta}{R^2}$$

where $\mu$, the "dipole moment," is written for 2qd, and note has been taken
of the fact that $P_1(x) = x$. The higher terms of (26) are to be retained
when R is comparable to d. When R < d, we may write

$$\rho_{1,2} = d\left(1 + \frac{R^2}{d^2} \mp \frac{2R}{d}\cos\theta\right)^{1/2}$$

and this leads to

$$V = \frac{2q}{d}\sum_{l=0}^{\infty}P_{2l+1}(\cos\theta)\left(\frac{R}{d}\right)^{2l+1} \tag{3-26a}$$

The method here illustrated may also be used to calculate the potential
arising from an arbitrary distribution of point charges.
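Both the generating relation (24) and the expansion (26) can be tested numerically. The sketch below is illustrative only and rests on our own assumptions: the function names are ours, the charges +q and -q are taken at distance d on either side of the center along the axis from which $\theta$ is measured, and $P_l$ is evaluated from the finite sum whose leading terms appear in (22), in exact rational arithmetic so that cancellation among large terms does no harm.

```python
from fractions import Fraction
from math import factorial, sqrt


def legendre(l, x):
    """P_l(x) from the finite sum whose first terms are displayed in (3-22)."""
    x = Fraction(x)
    total = Fraction(0)
    for k in range(l // 2 + 1):
        coeff = Fraction(
            (-1) ** k * factorial(2 * l - 2 * k),
            2 ** l * factorial(k) * factorial(l - k) * factorial(l - 2 * k),
        )
        total += coeff * x ** (l - 2 * k)
    return total


def generating_sum(x, y, terms=40):
    """Partial sum of the right-hand side of (3-24)."""
    return sum(float(legendre(l, x)) * y ** l for l in range(terms))


def pair_potential_series(q, d, big_r, cos_theta, terms=20):
    """Partial sum of (3-26); valid for d < R."""
    series = sum(
        float(legendre(2 * l + 1, cos_theta)) * (d / big_r) ** (2 * l + 1)
        for l in range(terms)
    )
    return 2.0 * q / big_r * series


if __name__ == "__main__":
    x, y = Fraction(3, 10), 0.4
    print(generating_sum(x, y), (1 - 2 * float(x) * y + y * y) ** -0.5)
    direct = 1.0 / sqrt(9 + 1 - 3) - 1.0 / sqrt(9 + 1 + 3)   # q = d = 1, R = 3, cos = 1/2
    print(pair_potential_series(1.0, 1.0, 3.0, Fraction(1, 2)), direct)
```

In each line of output the two numbers should agree to within rounding error, since the neglected terms are of the order of $y^{40}$ and $(d/R)^{41}$ respectively.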



From the foregoing results one can derive a useful expression for Pi. 
Let r be a vector extending from the origin, Az an increment in the «*direc- 
tion, and R = r + Az. (Cf. Fig. 3.) If then we express 

i = [r^ + (Az)^ + 2rAz cos 
R 

1 r, , 2i« , ^ /izVT" 
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by means of (24), putting — cos 0 = x and Az/r = y, we have 

1 1 " /Az\' 

-=-2 P,(-cos«)(-) (3-27) 

R r 1=0 \r / 

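On the other hand, if $1/R = 1/|\mathbf{r} + \Delta\mathbf{z}|$ is expanded in a Taylor series
about R = r the result is

$$\frac{1}{R} = \frac{1}{r} + \Delta z\,\frac{\partial}{\partial z}\left(\frac{1}{r}\right) + \cdots + \frac{(\Delta z)^l}{l!}\,\frac{\partial^l}{\partial z^l}\left(\frac{1}{r}\right) + \cdots \tag{3-28}$$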


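On comparing the coefficients of $(\Delta z)^l$ in (27) and (28) it is seen that

$$\frac{1}{r^{l+1}}\,P_l(-\cos\theta) = \frac{1}{l!}\,\frac{\partial^l}{\partial z^l}\left(\frac{1}{r}\right)$$

Since $P_l(-x) = (-1)^l P_l(x)$, this is equivalent to

$$P_l(\cos\theta) = \frac{(-1)^l}{l!}\,r^{l+1}\,\frac{\partial^l}{\partial z^l}\left(\frac{1}{r}\right) \tag{3-29}$$

In using this relation it is understood that $\cos\theta = z/r$ and
$r = (x^2 + y^2 + z^2)^{1/2}$.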

Another expression for $P_l$ involving an l-th derivative, and in some sense
simpler than (29), is known as Rodrigues' formula. To obtain it we observe,
first of all, that

$$(x^2 - 1)^l = \sum_{\lambda=0}^{l}(-1)^{\lambda}\frac{l!}{\lambda!\,(l-\lambda)!}\,x^{2(l-\lambda)}$$

in accordance with the binomial theorem. When this expression is differentiated
l times, there results:

$$\frac{d^l}{dx^l}(x^2-1)^l = \sum_{\lambda}(-1)^{\lambda}\frac{l!}{\lambda!\,(l-\lambda)!}\,\frac{(2l-2\lambda)!}{(l-2\lambda)!}\,x^{l-2\lambda}$$

the summation extending over all integers λ including 0 until λ equals
either $\tfrac12 l$ or $\tfrac12(l-1)$. The right-hand side of this equation may be written

$$\frac{(2l)!}{l!}\left[x^l - \frac{l(l-1)}{2(2l-1)}\,x^{l-2} + \cdots\right]$$

Hence, from the definition of $P_l$, eq. (22),

$$P_l(x) = \frac{1}{2^l\,l!}\,\frac{d^l}{dx^l}(x^2-1)^l \tag{3-30}$$

which is the formula of Rodrigues.
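Rodrigues' formula lends itself to a direct symbolic check. The sketch below is not part of the original text: it represents polynomials as coefficient lists (lowest power first), uses exact rational arithmetic, and the helper names are chosen here purely for illustration.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_diff(a):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]

def legendre_rodrigues(l):
    """Coefficients of P_l obtained literally from (3-30)."""
    p = [Fraction(1)]
    for _ in range(l):
        p = poly_mul(p, [Fraction(-1), Fraction(0), Fraction(1)])  # times (x^2 - 1)
    for _ in range(l):
        p = poly_diff(p)                                           # d/dx, l times
    scale = Fraction(1, 2**l * factorial(l))
    return [scale * c for c in p]

P3 = legendre_rodrigues(3)   # coefficients of (5x^3 - 3x)/2
```

Summing the coefficients of any `legendre_rodrigues(l)` gives $P_l(1)$, which should equal 1 for every l.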

An integral representation of $P_l$ is due to Schlaefli. It can be derived
by combining eq. (30) with the fact expressed by eq. (2). If the latter
equation is differentiated l times with respect to the argument of f, the
result may be written

$$\frac{d^l f(x)}{dx^l} = \frac{l!}{2\pi i}\oint\frac{f(z)}{(z-x)^{l+1}}\,dz$$

Now choose f(x) to be $(x^2-1)^l$, so that

$$\frac{d^l}{dx^l}(x^2-1)^l = \frac{l!}{2\pi i}\oint\frac{(z^2-1)^l}{(z-x)^{l+1}}\,dz$$

On comparing this with (30), it is seen that

$$P_l(x) = \frac{1}{2^l\,2\pi i}\oint\frac{(z^2-1)^l}{(z-x)^{l+1}}\,dz \tag{3-31}$$


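The path of integration here is understood to be some contour enclosing
the point x in a counter-clockwise sense. From this result, which is known
as Schlaefli's formula, it is possible to derive the formula:

$$P_l(x) = \frac{1}{\pi}\int_0^{\pi}\left[x + \sqrt{x^2-1}\,\cos\varphi\right]^l d\varphi \tag{3-32}$$

To do this, one may take the contour to be a circle of radius $\sqrt{|x^2-1|}$
about the point x, so that z in (31) is $x + \sqrt{x^2-1}\,e^{i\varphi}$ and φ varies from
−π to +π. The integral then becomes

$$P_l(x) = \frac{1}{2^l\,2\pi i}\int_{-\pi}^{\pi}\left[\frac{x^2 - 1 + 2x\sqrt{x^2-1}\,e^{i\varphi} + (x^2-1)e^{2i\varphi}}{\sqrt{x^2-1}\,e^{i\varphi}}\right]^l i\,d\varphi = \frac{1}{2\pi}\int_{-\pi}^{\pi}\left[x + \sqrt{x^2-1}\,\cos\varphi\right]^l d\varphi$$

This result is equivalent to (32) because the integral from −π to zero is
equal to that from zero to π.⁴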
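Formula (3-32) is easy to test by quadrature. The sketch below is an illustration only: it uses a midpoint rule (a choice made here, not taken from the text) and lets $\sqrt{x^2-1}$ be complex when |x| < 1, in which case the imaginary parts cancel in the integral.

```python
import cmath
import math

def legendre_laplace(l, x, steps=200):
    """Midpoint-rule estimate of (3-32) for real x; returns the real part."""
    root = cmath.sqrt(x * x - 1)          # imaginary when |x| < 1
    h = math.pi / steps
    total = sum(
        (x + root * math.cos((k + 0.5) * h)) ** l for k in range(steps)
    )
    return (total * h / math.pi).real

p2_half = legendre_laplace(2, 0.5)   # P_2(0.5) = (3*0.25 - 1)/2
p3_two = legendre_laplace(3, 2.0)    # P_3(2)  = (5*8 - 3*2)/2
```

Because the integrand is a trigonometric polynomial in φ, the midpoint rule reproduces the integral essentially to rounding error once the number of steps exceeds l.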

⁴ In general, the definite integral $\int_{-a}^{0}I(x)\,dx = \int_{0}^{a}I(x)\,dx$ if the integrand is an
even function of x, that is, if I(−x) = I(x). To see this, change the variable of
integration to −x, and make a corresponding change in the limits:

$$\int_{-a}^{0}I(x)\,dx = -\int_{a}^{0}I(-x)\,dx = -\int_{a}^{0}I(x)\,dx = \int_{0}^{a}I(x)\,dx$$

If I(x) is odd, that is if I(−x) = −I(x),

$$\int_{-a}^{0}I(x)\,dx = -\int_{0}^{a}I(x)\,dx, \quad\text{so that}\quad \int_{-a}^{a}I(x)\,dx = 0$$

3.4. Integral Properties of Legendre Polynomials. — Integrals over
products of Legendre polynomials, which are needed in many quantum
mechanical problems, are best obtained with the use of Rodrigues' formula.
We wish to calculate

$$\int_{-1}^{1}P_l(x)\,P_{l'}(x)\,dx$$


First, we suppose that l' > l. Substituting in accordance with eq. (30)
this integral becomes

$$\frac{1}{2^{l+l'}\,l!\,l'!}\int_{-1}^{1}\frac{d^l}{dx^l}(x^2-1)^l\cdot\frac{d^{l'}}{dx^{l'}}(x^2-1)^{l'}\,dx \tag{3-33}$$

After l' successive partial integrations, in which all the integrated parts
vanish because every derivative of $(x^2-1)^{l'}$ of order lower than l' is zero
at x = ±1, the remaining integral reads

$$\frac{(-1)^{l'}}{2^{l+l'}\,l!\,l'!}\int_{-1}^{1}(x^2-1)^{l'}\,\frac{d^{l+l'}}{dx^{l+l'}}(x^2-1)^l\,dx \tag{3-34}$$

But the (l + l')-th derivative of $(x^2-1)^l$ is certainly zero because the
highest power of x in $(x^2-1)^l$ is $x^{2l}$, and l + l' is, by hypothesis, greater
than 2l. Therefore the integral vanishes. This is clearly true whenever
l' ≠ l; for if l should be greater than l' we need only "unpeel" the derivatives
appearing in (33) in the reverse manner by partial integrations, and
we are left with an expression like (34) but with l and l' interchanged.
Next, suppose that l = l'. The integral in (34) then reads

$$\int_{-1}^{1}(x^2-1)^l\,\frac{d^{2l}}{dx^{2l}}(x^2-1)^l\,dx = (2l)!\int_{-1}^{1}(x^2-1)^l\,dx$$

the latter because the only term of $(x^2-1)^l$ which will not vanish after 2l
differentiations is $x^{2l}$. But on putting x = cos θ it is seen that

$$\int_{-1}^{1}(x^2-1)^l\,dx = (-1)^l\int_0^{\pi}\sin^{2l+1}\theta\,d\theta = (-1)^l\,\frac{2^{2l+1}(l!)^2}{(2l+1)!}$$

Collecting coefficients, we find in place of (34)

$$\frac{(-1)^l\,(2l)!}{2^{2l}(l!)^2}\cdot(-1)^l\,\frac{2^{2l+1}(l!)^2}{(2l+1)!} = \frac{2}{2l+1}$$

Our results may be combined in the formula

$$\int_{-1}^{1}P_l(x)\,P_{l'}(x)\,dx = \frac{2}{2l+1}\,\delta_{ll'} \tag{3-35}$$

The symbol $\delta_{ll'}$ here employed is freely used in mathematics and physics;
it is called the "Kronecker δ" and represents a discontinuous factor which
is taken to be unity when the two subscripts have the same value (l = l')
but is zero when they are not equal.
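The orthogonality relation (3-35) can be confirmed exactly for small l. In the sketch below the polynomial coefficients are generated from the three-term recurrence established as eq. (3-39) in the next section; the function names and the cutoff l < 5 are illustrative choices, not part of the text.

```python
from fractions import Fraction

def legendre_coeffs(l):
    """Coefficients of P_l (lowest power first) from the recurrence (3-39)."""
    prev, cur = [Fraction(1)], [Fraction(0), Fraction(1)]
    if l == 0:
        return prev
    for k in range(1, l):
        # P_{k+1} = [(2k+1) x P_k - k P_{k-1}] / (k+1)
        nxt = [Fraction(0)] + [Fraction(2 * k + 1, k + 1) * c for c in cur]
        for i, c in enumerate(prev):
            nxt[i] -= Fraction(k, k + 1) * c
        prev, cur = cur, nxt
    return cur

def integrate_product(a, b):
    """Exact integral over [-1, 1] of the product of two polynomials."""
    total = Fraction(0)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if (i + j) % 2 == 0:            # odd powers integrate to zero
                total += ai * bj * Fraction(2, i + j + 1)
    return total

# Table of overlap integrals for l, l' = 0..4.
overlap = [
    [integrate_product(legendre_coeffs(l), legendre_coeffs(lp)) for lp in range(5)]
    for l in range(5)
]
```

The off-diagonal entries of `overlap` should vanish and the diagonal entries should equal 2/(2l + 1).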

3.5. Recurrence Relations between Legendre Polynomials. — Relations
between Legendre polynomials are most simply derived from Schlaefli's
representation of $P_l(x)$, eq. (31). We first observe that

$$\oint\frac{(z^2-1)^l}{(z-x)^l}\,dz = \oint\frac{z(z^2-1)^l}{(z-x)^{l+1}}\,dz - x\oint\frac{(z^2-1)^l}{(z-x)^{l+1}}\,dz \tag{3-36}$$
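The first term on the right of this equation, however, may be transformed as
follows. Since

$$\frac{d}{dz}\left[\frac{(z^2-1)^{l+1}}{(z-x)^{l+1}}\right] = (l+1)\left[\frac{2z(z^2-1)^l}{(z-x)^{l+1}} - \frac{(z^2-1)^{l+1}}{(z-x)^{l+2}}\right]$$

and since the integral of the left-hand side around a closed contour must
vanish, we find

$$\oint\frac{z(z^2-1)^l}{(z-x)^{l+1}}\,dz = \frac{1}{2}\oint\frac{(z^2-1)^{l+1}}{(z-x)^{l+2}}\,dz$$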

Equation (36) thus reads

$$\oint\frac{(z^2-1)^l}{(z-x)^l}\,dz = \frac{1}{2}\oint\frac{(z^2-1)^{l+1}}{(z-x)^{l+2}}\,dz - x\oint\frac{(z^2-1)^l}{(z-x)^{l+1}}\,dz$$

Reference to equation (31) allows the two terms on the right to be identified,
after multiplication by $1/(2^l\,2\pi i)$, with $P_{l+1}(x) - xP_l(x)$. Hence

$$P_{l+1}(x) - xP_l(x) = \frac{1}{2^l\,2\pi i}\oint\frac{(z^2-1)^l}{(z-x)^l}\,dz \tag{3-37}$$

When this is differentiated with respect to x, there results

$$P'_{l+1}(x) - xP'_l(x) = (l+1)P_l(x) \tag{3-38}$$

which is the first important relation to be derived. It connects Legendre
functions, and their derivatives, of degrees l and l + 1.

A relation connecting Legendre polynomials of three different degrees
may be deduced in a similar way. Clearly, since

$$\oint\frac{d}{dz}\left[\frac{z(z^2-1)^l}{(z-x)^l}\right]dz = 0$$

we have

$$\oint\frac{(z^2-1)^l}{(z-x)^l}\,dz + 2l\oint\frac{z^2(z^2-1)^{l-1}}{(z-x)^l}\,dz - l\oint\frac{z(z^2-1)^l}{(z-x)^{l+1}}\,dz = 0$$

In the second integrand, we introduce $z^2 = (z^2-1) + 1$, and in the third
$z = (z-x) + x$, so that

$$(l+1)\oint\frac{(z^2-1)^l}{(z-x)^l}\,dz + 2l\oint\frac{(z^2-1)^{l-1}}{(z-x)^l}\,dz - lx\oint\frac{(z^2-1)^l}{(z-x)^{l+1}}\,dz = 0$$

Now the first term appearing here may be identified by means of (37), the
others by (31). Thus, after simple rearrangement,

$$(l+1)P_{l+1}(x) - (2l+1)xP_l(x) + lP_{l-1}(x) = 0 \tag{3-39}$$

The remaining relations are derived by differentiation and elimination
among formulas (38) and (39). Thus, when (39) is differentiated with
respect to x and $P'_{l+1}$ is eliminated by means of (38),

$$xP'_l(x) - P'_{l-1}(x) = lP_l(x) \tag{3-40}$$

Finally, the reader will have no difficulty in proving, by eliminations among
(38), (39), and (40), that

$$P'_{l+1}(x) - P'_{l-1}(x) = (2l+1)P_l(x) \tag{3-41}$$

and

$$(x^2-1)P'_l(x) = lxP_l(x) - lP_{l-1}(x) \tag{3-42}$$

It may be remarked that the recurrence relations here derived are also
correct for Legendre functions having non-integral indices l, although the
above proof does not indicate this fact explicitly.
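The relations (3-38) to (3-42) can be spot-checked numerically. The sketch below is illustrative only: it builds $P_l$ from (3-39), takes $P'_l$ from (3-42), and then evaluates the residuals of (3-38), (3-40) and (3-41) at an arbitrarily chosen point x = 0.3 and degree l = 4.

```python
def legendre_table(lmax, x):
    """Return lists P[l] and dP[l] for l = 0..lmax at a point with x*x != 1.

    P is built from the recurrence (3-39); dP is then obtained from (3-42).
    Assumes lmax >= 1.
    """
    P = [1.0, x]
    for l in range(1, lmax):
        P.append(((2 * l + 1) * x * P[l] - l * P[l - 1]) / (l + 1))
    dP = [0.0] + [
        l * (x * P[l] - P[l - 1]) / (x * x - 1) for l in range(1, lmax + 1)
    ]
    return P, dP

x = 0.3
P, dP = legendre_table(6, x)
l = 4
res38 = dP[l + 1] - x * dP[l] - (l + 1) * P[l]        # residual of (3-38)
res40 = x * dP[l] - dP[l - 1] - l * P[l]              # residual of (3-40)
res41 = dP[l + 1] - dP[l - 1] - (2 * l + 1) * P[l]    # residual of (3-41)
```

All three residuals should vanish to within rounding error.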

3.6. Associated Legendre Polynomials. — The associated Legendre
polynomial has been defined in sec. 2.11 as

$$P_l^m(x) = (1-x^2)^{m/2}\,\frac{d^m}{dx^m}P_l(x) \tag{3-43}$$

This definition is meaningless except when m is an integer not smaller than
zero. In the present discussion, this will always be understood to be the
case.

Recurrence Relations. We first derive the more important recurrence
relations between these functions, which, as was shown in 2.11, satisfy the
differential equation

$$(1-x^2)\,\frac{d^2P_l^m}{dx^2} - 2x\,\frac{dP_l^m}{dx} + \left[l(l+1) - \frac{m^2}{1-x^2}\right]P_l^m = 0$$

The function $P_l^{(m)} \equiv (d^m/dx^m)P_l$ was seen to be a solution of

$$(1-x^2)\,\frac{d^2}{dx^2}P_l^{(m)} - 2(m+1)x\,\frac{d}{dx}P_l^{(m)} + \left[l(l+1) - m(m+1)\right]P_l^{(m)} = 0 \tag{3-44}$$

When this equation is multiplied by $(1-x^2)^{m/2}$ and the definition (43) is
used, it may be written

$$P_l^{m+2} - 2(m+1)\,\frac{x}{\sqrt{1-x^2}}\,P_l^{m+1} + \left[l(l+1) - m(m+1)\right]P_l^m = 0$$

or, on replacing m by m − 1,

$$P_l^{m+1} - 2m\,\frac{x}{\sqrt{1-x^2}}\,P_l^m + \left[l(l+1) - m(m-1)\right]P_l^{m-1} = 0 \tag{3-45}$$

This represents the fundamental relation between three associated Legendre
functions with equal l but consecutive values of m.

To get a similar relation for equal m but consecutive l we return to
eqs. (39) and (41). Differentiating the first of these m times we have

$$(l+1)P_{l+1}^{(m)} - (2l+1)xP_l^{(m)} - (2l+1)mP_l^{(m-1)} + lP_{l-1}^{(m)} = 0 \tag{3-46}$$

When (41) is differentiated m − 1 times, the result is

$$P_{l+1}^{(m)} - P_{l-1}^{(m)} = (2l+1)P_l^{(m-1)} \tag{3-47}$$

On eliminating $P_l^{(m-1)}$ between (46) and (47) we find

$$(2l+1)xP_l^{(m)} = (l+m)P_{l-1}^{(m)} + (l-m+1)P_{l+1}^{(m)}$$

When this is multiplied by $(1-x^2)^{m/2}$ there results the desired relation:

$$xP_l^m = (2l+1)^{-1}\left[(l+m)P_{l-1}^m + (l-m+1)P_{l+1}^m\right] \tag{3-48}$$

Two mixed relations, in which both l and m have different values,
are often useful (cf. Chapter 11) and will now be derived. One is at once
obtained from (47) when that equation is written with m replaced by m + 1
and then multiplied by $(1-x^2)^{(m+1)/2}$:

$$(1-x^2)^{1/2}P_l^m = (2l+1)^{-1}\left[P_{l+1}^{m+1} - P_{l-1}^{m+1}\right] \tag{3-49}$$

The other can be deduced from eq. (45). When $xP_l^m$ in (45) is eliminated
by means of (48) it reads:

$$P_l^{m+1} = \frac{2m}{(2l+1)(1-x^2)^{1/2}}\left[(l+m)P_{l-1}^m + (l-m+1)P_{l+1}^m\right] - \left[l(l+1) - m(m-1)\right]P_l^{m-1}$$

Here $P_l^{m-1}$ can be expressed in terms of $P_{l+1}^m$ and $P_{l-1}^m$ by means of (49)
(written for m − 1 instead of m). When this is done and the terms are
collected, we find

$$(1-x^2)^{1/2}P_l^{m+1} = (2l+1)^{-1}\left[(l+m)(l+m+1)P_{l-1}^m - (l-m)(l-m+1)P_{l+1}^m\right] \tag{3-50a}$$

For convenience in later work this may be written in a form similar to (49)
if only m is replaced by m − 1. Thus

$$(1-x^2)^{1/2}P_l^m = (2l+1)^{-1}\left[(l+m)(l+m-1)P_{l-1}^{m-1} - (l-m+1)(l-m+2)P_{l+1}^{m-1}\right] \tag{3-50}$$

It is seen from (49) and (50) that $(1-x^2)^{1/2}P_l^m$ can be expressed in terms
of $P_{l+1}^{m+1}$ and $P_{l-1}^{m+1}$, as well as $P_{l-1}^{m-1}$ and $P_{l+1}^{m-1}$. Relations (48), (49), and
(50) are used in calculating quantum mechanical matrix elements in
central field problems (cf. sec. 11.13).

Integral Properties. It is desired to evaluate the integral over the
product of two associated Legendre functions having the same index m.
If we use the definition (43) together with Rodrigues' formula (30) we have

$$\int_{-1}^{1}P_l^m(x)\,P_{l'}^m(x)\,dx = \frac{(-1)^m}{2^{l+l'}\,l!\,l'!}\int_{-1}^{1}X^m\,\frac{d^{l+m}X^l}{dx^{l+m}}\,\frac{d^{l'+m}X^{l'}}{dx^{l'+m}}\,dx \tag{3-51}$$

where $X = x^2 - 1$. As was done in connection with the Legendre polynomials,
we again carry out a sequence of partial integrations, l' + m in
number, in which all the integrated parts are zero. The integral in (51)
then reads

$$(-1)^{l'+m}\int_{-1}^{1}X^{l'}\,\frac{d^{l'+m}}{dx^{l'+m}}\left[X^m\,\frac{d^{l+m}X^l}{dx^{l+m}}\right]dx = (-1)^{l'+m}\int_{-1}^{1}X^{l'}\sum_{\lambda}\binom{l'+m}{\lambda}\frac{d^{l'+m-\lambda}X^m}{dx^{l'+m-\lambda}}\,\frac{d^{l+m+\lambda}X^l}{dx^{l+m+\lambda}}\,dx$$

Now the term of highest power in $X^m$ is $x^{2m}$, that in $X^l$ is $x^{2l}$. Therefore
every term in the summation over λ will be zero unless, simultaneously,

$$l' + m - \lambda \le 2m \quad\text{and}\quad l + m + \lambda \le 2l \tag{3-52}$$

The first of these implies: λ ≥ l' − m, the second λ ≤ l − m. Let us
suppose that l < l'. Then these two relations are incompatible,
and the summation contains no term which is different from zero.
Hence the integral (51) vanishes. If l > l', it must also be zero because
the integrand is perfectly symmetrical with respect to l and l'. To show
this result explicitly, the partial integrations must be performed in the
reverse manner.

If l' = l the two relations (52) are indeed compatible, but only for the
single value λ = l − m. Hence the sum over λ contains only one term,
and the integral becomes

$$(-1)^{l+m}\binom{l+m}{l-m}\int_{-1}^{1}X^l\,\frac{d^{2m}X^m}{dx^{2m}}\,\frac{d^{2l}X^l}{dx^{2l}}\,dx = (-1)^{l+m}\binom{l+m}{l-m}(2m)!\,(2l)!\int_{-1}^{1}X^l\,dx$$

The remaining integral has already been computed (preceding eq. 35);
it was found to be

$$(-1)^l\,\frac{2^{2l+1}(l!)^2}{(2l+1)!}$$

On the other hand, $\binom{l+m}{l-m} = \frac{(l+m)!}{(l-m)!\,(2m)!}$. Thus we find, collecting
the various factors,

$$\int_{-1}^{1}\left[P_l^m(x)\right]^2dx = \frac{(-1)^m}{2^{2l}(l!)^2}\,(-1)^{l+m}\,\frac{(l+m)!}{(l-m)!\,(2m)!}\,(2m)!\,(2l)!\,(-1)^l\,\frac{2^{2l+1}(l!)^2}{(2l+1)!} = \frac{(l+m)!}{(l-m)!}\cdot\frac{2}{2l+1}$$

It has thus been proved that

$$\int_{-1}^{1}P_l^m(x)\,P_{l'}^m(x)\,dx = \frac{(l+m)!}{(l-m)!}\,\frac{2}{2l+1}\,\delta_{ll'} \tag{3-53}$$

If x is taken to be cos θ, this result may be written in the equivalent form:

$$\int_0^{\pi}P_l^m(\cos\theta)\,P_{l'}^m(\cos\theta)\,\sin\theta\,d\theta = \frac{(l+m)!}{(l-m)!}\,\frac{2}{2l+1}\,\delta_{ll'} \tag{3-53a}$$
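Since $P_l^m P_{l'}^m = (1-x^2)^m\,(d^mP_l/dx^m)(d^mP_{l'}/dx^m)$ is itself a polynomial, (3-53) can be verified exactly for small indices. The following sketch assumes the definition (3-43) without any additional sign convention; the helper names and the range of indices tested are illustrative.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_diff(a):
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]

def legendre_coeffs(l):
    """Coefficients of P_l (lowest power first) from the recurrence (3-39)."""
    prev, cur = [Fraction(1)], [Fraction(0), Fraction(1)]
    if l == 0:
        return prev
    for k in range(1, l):
        nxt = [Fraction(0)] + [Fraction(2 * k + 1, k + 1) * c for c in cur]
        for i, c in enumerate(prev):
            nxt[i] -= Fraction(k, k + 1) * c
        prev, cur = cur, nxt
    return cur

def assoc_overlap(l, lp, m):
    """Exact integral over [-1, 1] of P_l^m P_lp^m, with P_l^m as in (3-43)."""
    a, b = legendre_coeffs(l), legendre_coeffs(lp)
    for _ in range(m):
        a, b = poly_diff(a), poly_diff(b)
    weight = [Fraction(1)]
    for _ in range(m):
        weight = poly_mul(weight, [Fraction(1), Fraction(0), Fraction(-1)])  # (1 - x^2)
    integrand = poly_mul(weight, poly_mul(a, b))
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(integrand) if k % 2 == 0)

def expected(l, lp, m):
    """Right-hand side of (3-53)."""
    if l != lp:
        return Fraction(0)
    return Fraction(factorial(l + m), factorial(l - m)) * Fraction(2, 2 * l + 1)
```

For example, `assoc_overlap(1, 1, 1)` is the integral of $1 - x^2$ over [−1, 1], namely 4/3, in agreement with (2!/0!)·(2/3).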


3.7. Addition Theorem for Legendre Polynomials. — To prove the
famous addition theorem for Legendre polynomials (eq. 61) it is necessary
first to establish a formula due to Heine. If we substitute the Schlaefli
integral, eq. (31), for $P_l$ in the definition of $P_l^m$ (eq. 43) and carry out the
differentiations with respect to x under the integral sign, we have

$$P_l^m(x) = \frac{1}{2^l\,2\pi i}\,(l+1)(l+2)\cdots(l+m)\,(1-x^2)^{m/2}\oint(z^2-1)^l(z-x)^{-l-m-1}\,dz$$

Now let $z = x + \sqrt{x^2-1}\,e^{i\varphi}$ and integrate over φ from −π to π in accordance
with the meaning of the contour (cf. eq. 31 et seq.). Then

$$P_l^m(x) = \frac{(l+1)(l+2)\cdots(l+m)}{2\pi}\,(1-x^2)^{m/2}\int_{-\pi}^{\pi}\frac{\left[x + \sqrt{x^2-1}\,\cos\varphi\right]^l}{\left[\sqrt{x^2-1}\,e^{i\varphi}\right]^m}\,d\varphi = \frac{(l+1)(l+2)\cdots(l+m)}{2\pi}\,(-1)^{m/2}\int_{-\pi}^{\pi}\left[x + \sqrt{x^2-1}\,\cos\varphi\right]^l e^{-im\varphi}\,d\varphi$$

$$P_l^m(x) = \frac{(l+1)(l+2)\cdots(l+m)}{\pi}\,(-1)^{m/2}\int_0^{\pi}\left[x + \sqrt{x^2-1}\,\cos\varphi\right]^l\cos m\varphi\,d\varphi \tag{3-54a}$$

In taking the last step we observe that, of the two constituents of
$e^{-im\varphi} = \cos m\varphi - i\sin m\varphi$, the first is an even, the second an odd function
of φ. Since the other remaining factor of the integrand is even, only the
cosine part of $e^{-im\varphi}$ will give a finite integral, and this has twice the value of
the integral between the limits 0 and π. Eq. (54a) is Heine's formula.

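If in the differential equation for $P_l^m(x)$ we substitute −l − 1 for l, the
equation remains unaltered. Therefore $P_{-l-1}^m = P_l^m$. In view of this,
Heine's formula may also be written

$$P_l^m(x) = P_{-l-1}^m(x) = \frac{(-l)(-l+1)\cdots(-l+m-1)}{\pi}\,(-1)^{m/2}\int_0^{\pi}\left[x + \sqrt{x^2-1}\,\cos\varphi\right]^{-l-1}\cos m\varphi\,d\varphi$$

$$P_l^m(x) = (-1)^m\,\frac{l(l-1)\cdots(l-m+1)}{\pi}\,(-1)^{m/2}\int_0^{\pi}\frac{\cos m\varphi\,d\varphi}{\left[x + \sqrt{x^2-1}\,\cos\varphi\right]^{l+1}} \tag{3-54b}$$

To prove the addition theorem we consider the equation

$$\sum_{l=0}^{\infty}p^l\,\frac{\left[x_1 + (x_1^2-1)^{1/2}\cos(\omega-\alpha)\right]^l}{\left[x_2 + (x_2^2-1)^{1/2}\cos\alpha\right]^{l+1}} = \left\{x_2 + (x_2^2-1)^{1/2}\cos\alpha - p\left[x_1 + (x_1^2-1)^{1/2}\cos(\omega-\alpha)\right]\right\}^{-1} \tag{3-55}$$

which is an identity for sufficiently small values of the parameter p. All
other quantities appearing in (55) are supposed to be real, but otherwise
unrestricted. This relation is simply an application of the expansion

$$\sum_{l=0}^{\infty}x^l = (1-x)^{-1}, \quad |x| < 1$$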

Let us integrate eq. (55) over dα from −π to π. The integral on the right
may be evaluated by means of the formula

$$\int_{-\pi}^{\pi}\frac{d\alpha}{a + b\cos\alpha + c\sin\alpha} = 2\pi\,(a^2 - b^2 - c^2)^{-1/2}$$

which was proved as an example on page 89. Here

$$a = x_2 - px_1$$
$$b = (x_2^2-1)^{1/2} - p\,(x_1^2-1)^{1/2}\cos\omega$$
$$c = -p\,(x_1^2-1)^{1/2}\sin\omega$$

Hence the right-hand side of (55) becomes after integration

$$2\pi\left\{1 - 2p\left[x_1x_2 - (x_1^2-1)^{1/2}(x_2^2-1)^{1/2}\cos\omega\right] + p^2\right\}^{-1/2}$$

As will be seen forthwith, the expression in [ ] appearing here has a very
simple geometrical meaning. For the present, let us designate it by x:

$$x = x_1x_2 - (x_1^2-1)^{1/2}(x_2^2-1)^{1/2}\cos\omega \tag{3-56}$$

The result of the integration may therefore be written

$$2\pi\,(1 - 2px + p^2)^{-1/2}$$

But by the theorem on Legendre functions, eq. (24), this is

$$2\pi\sum_{l=0}^{\infty}p^l\,P_l(x)$$

The left hand side of (55) may be integrated term by term because the
expression is assumed to converge. On comparing coefficients of $p^l$ we
see, therefore, that

$$P_l(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\left[x_1 + (x_1^2-1)^{1/2}\cos(\omega-\alpha)\right]^l}{\left[x_2 + (x_2^2-1)^{1/2}\cos\alpha\right]^{l+1}}\,d\alpha \tag{3-57}$$

The last step of the proof involves an expansion of $P_l(x)$ in a Fourier
series.⁵ Clearly, $P_l(x)$, being a polynomial of degree l in cos ω, can be
expressed in the form

$$P_l(x) = \tfrac{1}{2}c_0 + \sum_{m=1}^{l}c_m\cos m\omega \tag{3-58}$$

The coefficients $c_m$ are given by

$$c_m = \frac{1}{\pi}\int_{-\pi}^{\pi}P_l(x)\cos m\omega\,d\omega = \frac{1}{2\pi^2}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}\frac{\left[x_1 + (x_1^2-1)^{1/2}\cos(\omega-\alpha)\right]^l}{\left[x_2 + (x_2^2-1)^{1/2}\cos\alpha\right]^{l+1}}\cos m\omega\,d\omega\,d\alpha$$

when $P_l(x)$ is replaced in accordance with (57). In this integral, we may
introduce the variable β = ω − α in place of ω, so that

$$c_m = \frac{1}{2\pi^2}\int_{-\pi}^{\pi}\frac{d\alpha}{\left[x_2 + (x_2^2-1)^{1/2}\cos\alpha\right]^{l+1}}\int_{-\pi}^{\pi}\left[x_1 + (x_1^2-1)^{1/2}\cos\beta\right]^l(\cos m\alpha\cos m\beta - \sin m\alpha\sin m\beta)\,d\beta$$

The integration with respect to β over the term containing sin mβ is obviously
zero because the integrand is odd. The other term can be evaluated
by means of eq. (54a). The result is

$$c_m = \frac{1}{2\pi^2}\int_{-\pi}^{\pi}\frac{\cos m\alpha\,d\alpha}{\left[x_2 + (x_2^2-1)^{1/2}\cos\alpha\right]^{l+1}}\cdot\frac{2\pi\,(-1)^{-m/2}}{(l+1)(l+2)\cdots(l+m)}\,P_l^m(x_1)$$

In the remaining integration over α we use (54b), obtaining

$$c_m = 2\,\frac{(l-m)!}{(l+m)!}\,P_l^m(x_1)\,P_l^m(x_2)$$

Hence from (58):

$$P_l(x) = P_l(x_1)P_l(x_2) + 2\sum_{m=1}^{l}\frac{(l-m)!}{(l+m)!}\,P_l^m(x_1)\,P_l^m(x_2)\cos m\omega \tag{3-59}$$

which is the desired addition theorem.

⁵ See sec. 8.2.

Finally, let us investigate the meaning of x defined in eq. (56). If
$\theta_1, \varphi_1$ and $\theta_2, \varphi_2$ denote, respectively, the polar and azimuthal angles of
two lines passing through the origin, then θ, the angle between these two
lines, is given by

$$\cos\theta = \cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2\cos(\varphi_1 - \varphi_2) \tag{3-60}$$

Thus, if in (56) the following identifications are made:

$$x = \cos\theta, \quad x_1 = \cos\theta_1, \quad x_2 = \cos\theta_2, \quad \omega = \varphi_1 - \varphi_2$$

then (59) becomes

$$P_l(\cos\theta) = P_l(\cos\theta_1)P_l(\cos\theta_2) + 2\sum_{m=1}^{l}\frac{(l-m)!}{(l+m)!}\,P_l^m(\cos\theta_1)\,P_l^m(\cos\theta_2)\cos m(\varphi_1 - \varphi_2) \tag{3-61}$$
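The addition theorem (3-61) can be tested at arbitrary angles. The sketch below is an illustration under stated assumptions: $P_l^m$ is evaluated from the definition (3-43) by differentiating a coefficient list, floating-point arithmetic is used throughout, and the function names and sample angles are hypothetical.

```python
import math

def legendre_coeffs(l):
    """Float coefficients of P_l (lowest power first) from the recurrence (3-39)."""
    prev, cur = [1.0], [0.0, 1.0]
    if l == 0:
        return prev
    for k in range(1, l):
        nxt = [0.0] + [(2 * k + 1) / (k + 1) * c for c in cur]
        for i, c in enumerate(prev):
            nxt[i] -= k / (k + 1) * c
        prev, cur = cur, nxt
    return cur

def assoc_legendre(l, m, x):
    """P_l^m(x) = (1 - x^2)^(m/2) d^m P_l / dx^m, as in (3-43), for 0 <= m <= l."""
    c = legendre_coeffs(l)
    for _ in range(m):
        c = [k * c[k] for k in range(1, len(c))]
    return (1 - x * x) ** (m / 2) * sum(ck * x**k for k, ck in enumerate(c))

def addition_rhs(l, t1, p1, t2, p2):
    """Right-hand side of (3-61)."""
    x1, x2 = math.cos(t1), math.cos(t2)
    total = assoc_legendre(l, 0, x1) * assoc_legendre(l, 0, x2)
    for m in range(1, l + 1):
        ratio = math.factorial(l - m) / math.factorial(l + m)
        total += (2 * ratio * assoc_legendre(l, m, x1)
                  * assoc_legendre(l, m, x2) * math.cos(m * (p1 - p2)))
    return total

def addition_lhs(l, t1, p1, t2, p2):
    """Left-hand side of (3-61), with cos(theta) taken from (3-60)."""
    cos_theta = (math.cos(t1) * math.cos(t2)
                 + math.sin(t1) * math.sin(t2) * math.cos(p1 - p2))
    return assoc_legendre(l, 0, cos_theta)

angles = (0.4, 1.1, 1.3, -0.5)        # theta_1, phi_1, theta_2, phi_2
lhs = addition_lhs(3, *angles)
rhs = addition_rhs(3, *angles)
```

The two sides should agree to rounding error for any choice of angles and any l.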


In quantum mechanics (cf. sec. 11.12) it is convenient to use associated
Legendre functions which differ from $P_l^m(x)$ by factors depending on l and
m, but constant with respect to x and so chosen that the integral over the
square of the functions is unity. These functions are called "normalized"
associated Legendre functions. They will here be denoted by $\Pi_l^m$. Let
us put $\Pi_l^m(x) = N_{l,m}P_l^m(x)$. Then if we wish $\int_0^{\pi}\left[\Pi_l^m(\cos\theta)\right]^2\sin\theta\,d\theta$ to
be equal to unity we must, in view of (53), put

$$N_{l,m} = \left[\frac{2l+1}{2}\,\frac{(l-m)!}{(l+m)!}\right]^{1/2}$$

It is also customary to permit the index m of $\Pi_l^m$ to be negative and to
define

$$\Pi_l^{-m}(x) = \Pi_l^m(x) = \left[\frac{2l+1}{2}\,\frac{(l-|m|)!}{(l+|m|)!}\right]^{1/2}P_l^{|m|}(x) \tag{3-62}$$

The index m may then take on all integral values including zero from −l
to +l, while l is always a positive integer. In terms of the functions $\Pi_l^m$
the addition theorem takes a particularly simple form:

$$\left(\frac{2l+1}{2}\right)^{1/2}\Pi_l^0(\cos\theta) = \sum_{m=-l}^{l}\Pi_l^m(\cos\theta_1)\,\Pi_l^m(\cos\theta_2)\cos m(\varphi_1 - \varphi_2) \tag{3-61a}$$

One may also replace the factor $\cos m(\varphi_1 - \varphi_2)$ in each term of this summation
by $e^{im(\varphi_1 - \varphi_2)}$ because each pair of terms corresponding to +m and −m
then yields a cosine function.

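3.8. Bessel Functions. — In sec. 2.14 we have shown that a particular
solution of Bessel's differential equation (eq. 2-57) is the "Bessel function
of order n," defined as (cf. eq. 2-60)

$$J_n(x) = \sum_{\lambda=0}^{\infty}\frac{(-1)^{\lambda}}{\lambda!\,\Gamma(n+\lambda+1)}\left(\frac{x}{2}\right)^{n+2\lambda} \tag{3-63}$$

It is of interest, first, to note that for integral n, $J_n(x)$ is the coefficient of $u^n$
in the expansion of exp [(x/2)(u − 1/u)]. In fact $J_n(x)$ for integral n,
called Bessel's coefficient, was originally defined by means of this relation.
To prove it we merely expand the exponential, using the binomial theorem⁶
to express $(u - 1/u)^{\nu}$:

$$\exp\left[\frac{x}{2}\left(u - \frac{1}{u}\right)\right] = \sum_{\nu=0}^{\infty}\frac{1}{\nu!}\left(\frac{x}{2}\right)^{\nu}\left(u - \frac{1}{u}\right)^{\nu} = \sum_{\nu=0}^{\infty}\sum_{\lambda=0}^{\nu}\frac{(-1)^{\lambda}}{(\nu-\lambda)!\,\lambda!}\left(\frac{x}{2}\right)^{\nu}u^{\nu-2\lambda}$$

If we now put ν − 2λ = n, this becomes

$$\exp\left[\frac{x}{2}\left(u - \frac{1}{u}\right)\right] = \sum_{n=-\infty}^{\infty}u^n\left[\sum_{\lambda=0}^{\infty}\frac{(-1)^{\lambda}}{(n+\lambda)!\,\lambda!}\left(\frac{x}{2}\right)^{n+2\lambda}\right] \tag{3-64}$$

⁶ The binomial theorem states: $(a+b)^{\nu} = \sum_{\lambda=0}^{\nu}\frac{\nu!}{(\nu-\lambda)!\,\lambda!}\,a^{\nu-\lambda}b^{\lambda}$.

For integral n, the bracket appearing here is identical with the expansion
(63); hence the above-mentioned theorem is proved.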

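From (64) an integral representation may be derived quite simply.
By the theorem of residues (eq. 3) the coefficient of $z^{-1}$ in an expansion of
f(z) is given by

$$a_{-1} = \frac{1}{2\pi i}\oint f(z)\,dz$$

the integral being taken in a counter-clockwise sense about z = 0. Similarly,
the coefficient of $z^n$ in the expansion of f(z) will be:

$$a_n = \frac{1}{2\pi i}\oint\frac{f(z)}{z^{n+1}}\,dz$$

The theorem just proved is therefore tantamount to the relation

$$J_n(x) = \frac{1}{2\pi i}\oint u^{-n-1}\exp\left[\frac{x}{2}\left(u - \frac{1}{u}\right)\right]du \tag{3-65}$$

It is customary to write this result in a slightly different form, obtainable
on replacement of the variable u by 2t/x. Eq. (65) then reads

$$J_n(x) = \frac{1}{2\pi i}\left(\frac{x}{2}\right)^n\oint t^{-n-1}\exp\left[t - \frac{x^2}{4t}\right]dt \tag{3-66}$$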

While this integral has been shown here to be identical with the convergent
sum of eq. (63) only if n is an integer, a more special consideration
would indeed establish the equivalence of (63) and (66) for non-integral
n also. A simple way to prove this fact is to show, by substitution, that
(66) satisfies Bessel's differential equation. On performing the differentiations
indicated on the left of eq. (2-57)⁷ and substituting therein, we find

$$x^2J_n'' + xJ_n' + (x^2 - n^2)J_n = \frac{4}{2\pi i}\left(\frac{x}{2}\right)^{n+2}\oint t^{-n-1}\left[1 - \frac{n+1}{t} + \frac{x^2}{4t^2}\right]\exp\left[t - \frac{x^2}{4t}\right]dt = \frac{4}{2\pi i}\left(\frac{x}{2}\right)^{n+2}\oint\frac{d}{dt}\left\{t^{-n-1}\exp\left[t - \frac{x^2}{4t}\right]\right\}dt = 0$$

because the integral around a closed loop of an exact differential is zero.
We may therefore regard either (63) or (66) as a definition of $J_n(x)$ for both
integral and non-integral values of n. For non-integral n, however, caution
is required in the choice of the contour of integration in (66). This must
clearly enclose the origin. But if we were to take, for example, a circle
about the origin as center we should encounter a difficulty. For non-integral
n the integrand is a many-valued function of t. Thus if the amplitude
of t should vary, say, from −π to +π, the integrand will not have performed
a closed loop⁸ and the last equation above would not be true. It is necessary,
therefore, to select a path of integration which (a) encloses the origin
and no other singularity of the integrand; (b) starts and ends at a point in
the t-plane which will cause the integrand to perform a closed loop also.

⁷ The differentiations may be carried out in (66) without regard to the fixed path of
integration, that is, "under the integral sign."

Fig. 3-4

Such a path is that illustrated in Fig. 4. Whenever eq. (65) or (66) is used,
we shall understand that the contour integral is taken along this path.
(The reader familiar with the theory of many-valued functions will observe
that this path confines the integrand entirely to one of its branches provided
that the argument of t is given its principal value.)

Recurrence Relations. From (66) one may show by direct differentiation
that

$$\frac{d}{dx}\left[x^{-n}J_n(x)\right] = -x^{-n}J_{n+1}(x) \tag{3-67}$$

or, when the differentiation on the left is carried out,

$$J_n'(x) = \frac{n}{x}\,J_n(x) - J_{n+1}(x) \tag{3-68}$$

To obtain the other fundamental recurrence formula we perform the
differentiations in the equation

$$\oint\frac{d}{dt}\left\{t^{-n}\exp\left[t - \frac{x^2}{4t}\right]\right\}dt = 0$$

When use is now made of the definition (66) this reads

$$2\pi i\left(\frac{2}{x}\right)^n\left[-nJ_n(x) + \frac{x}{2}\,J_{n-1}(x) + \frac{x}{2}\,J_{n+1}(x)\right] = 0$$

hence

$$J_{n-1}(x) + J_{n+1}(x) = \frac{2n}{x}\,J_n(x) \tag{3-69}$$

On eliminating $J_{n+1}$ from (68) by means of (69) there results

$$J_n'(x) = J_{n-1}(x) - \frac{n}{x}\,J_n(x) \tag{3-70}$$

and from this and (68)

$$J_n'(x) = \tfrac{1}{2}\left[J_{n-1}(x) - J_{n+1}(x)\right] \tag{3-71}$$

⁸ As an example of many-valued functions, consider $\sqrt{t}$. When
t moves along a circle of unit radius from −π to +π, initial and final points are identical.
But $\sqrt{t}$ has the initial point $e^{-i\pi/2} = -i$ and the final point $e^{i\pi/2} = +i$.

Bessel's Integral. Let us consider $J_n(x)$ as defined by eq. (65):

$$J_n(x) = \frac{1}{2\pi i}\oint u^{-n-1}\exp\left[\frac{x}{2}\left(u - \frac{1}{u}\right)\right]du$$

We may free ourselves from the condition that n be an integer by choosing
a path like that of Fig. 4. More specifically, the contour will be taken to
start at −∞, pass to the right below the real axis ($u = e^{-i\pi}t$, ∞ > t > 1)
up to the point −1, then perform a circle of unit radius in a counter-clockwise
sense about the origin ($u = e^{i\theta}$, −π < θ < π) and finally to return to
−∞ above the real axis ($u = e^{i\pi}t$, 1 < t < +∞). The contour integral
then becomes

$$J_n(x) = \frac{e^{(n+1)\pi i}}{2\pi i}\int_1^{\infty}t^{-n-1}e^{-\frac{x}{2}\left(t - \frac{1}{t}\right)}dt + \frac{1}{2\pi}\int_{-\pi}^{\pi}e^{-in\theta + ix\sin\theta}\,d\theta - \frac{e^{-(n+1)\pi i}}{2\pi i}\int_1^{\infty}t^{-n-1}e^{-\frac{x}{2}\left(t - \frac{1}{t}\right)}dt$$

The second of these integrals may be written

$$\frac{1}{\pi}\int_0^{\pi}\cos(n\theta - x\sin\theta)\,d\theta$$

because the odd part of the exponential, sin (nθ − x sin θ), vanishes on integration
between −π and +π. The first and last may be transformed by
putting $t = e^{\theta}$ and noting that

$$t - \frac{1}{t} = 2\sinh\theta$$

When they are combined, the result is

$$\frac{1}{2\pi i}\left[e^{(n+1)\pi i} - e^{-(n+1)\pi i}\right]\int_0^{\infty}e^{-n\theta - x\sinh\theta}\,d\theta = -\frac{\sin n\pi}{\pi}\int_0^{\infty}e^{-n\theta - x\sinh\theta}\,d\theta$$

Hence

$$J_n(x) = \frac{1}{\pi}\int_0^{\pi}\cos(n\theta - x\sin\theta)\,d\theta - \frac{\sin n\pi}{\pi}\int_0^{\infty}\exp(-n\theta - x\sinh\theta)\,d\theta \tag{3-72}$$

This is a generalized form of Bessel's integral, derived by Bessel for integral
values of n. In that special case the second term vanishes and

$$J_n(x) = \frac{1}{\pi}\int_0^{\pi}\cos(n\theta - x\sin\theta)\,d\theta, \quad n = \text{integer} \tag{3-72a}$$
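For integral n, Bessel's integral (3-72a) can be compared with the series (3-63). The following sketch is illustrative only: the midpoint rule, the truncation at 40 terms and the function names are choices made here, and the series routine assumes n ≥ 0 so that the gamma function has no poles.

```python
import math

def bessel_series(n, x, terms=40):
    """Partial sum of the series (3-63); assumes order n >= 0."""
    return sum(
        (-1) ** lam / (math.factorial(lam) * math.gamma(n + lam + 1))
        * (x / 2) ** (n + 2 * lam)
        for lam in range(terms)
    )

def bessel_integral(n, x, steps=400):
    """Midpoint-rule estimate of Bessel's integral (3-72a), integer n only."""
    h = math.pi / steps
    total = sum(
        math.cos(n * (k + 0.5) * h - x * math.sin((k + 0.5) * h))
        for k in range(steps)
    )
    return total * h / math.pi

j2_series = bessel_series(2, 3.5)
j2_integral = bessel_integral(2, 3.5)
```

Since the integrand is smooth, even and periodic in θ, the midpoint rule converges very rapidly, and the two numbers should agree to rounding error.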

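Bessel Functions of Half-Odd Order. When n is half an odd integer,
for instance, p + ½, $J_n(x)$ takes a particularly simple form and is related
closely to the trigonometric functions. To show this, let us first compute
$J_{1/2}(x)$ by the expansion (63). We may then use the recurrence formulas
to obtain $J_{3/2}$, etc. Thus,

$$J_{1/2}(x) = \left(\frac{x}{2}\right)^{1/2}\sum_{\lambda=0}^{\infty}\frac{(-1)^{\lambda}}{\lambda!\,\Gamma(\tfrac32+\lambda)}\left(\frac{x}{2}\right)^{2\lambda}$$

But in view of eq. (5), etc.,

$$\Gamma(\tfrac32+\lambda) = \frac{2\lambda+1}{2}\cdot\frac{2\lambda-1}{2}\cdots\frac{1}{2}\,\sqrt{\pi} = \frac{(2\lambda+1)!}{2^{2\lambda+1}\,\lambda!}\,\sqrt{\pi}$$

When this is substituted in the series for $J_{1/2}(x)$, there results

$$J_{1/2}(x) = \left(\frac{2}{\pi x}\right)^{1/2}\sum_{\lambda=0}^{\infty}\frac{(-1)^{\lambda}x^{2\lambda+1}}{(2\lambda+1)!} = \left(\frac{2}{\pi x}\right)^{1/2}\sin x \tag{3-73}$$

From (67),

$$J_{3/2}(x) = -x^{1/2}\,\frac{d}{dx}\left[x^{-1/2}J_{1/2}(x)\right] = \left(\frac{2}{\pi x}\right)^{1/2}\left(\frac{\sin x}{x} - \cos x\right)$$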

This process may be continued if the explicit form of the functions $J_{p+1/2}(x)$
is desired. A general formula is readily obtainable as follows. Eq. (67)
may be written

$$x^{-n-1}J_{n+1}(x) = -2\,\frac{d}{d(x^2)}\left[x^{-n}J_n(x)\right] \tag{3-74}$$

or, by repeated application of (67)

$$x^{-n-p}J_{n+p}(x) = (-2)^p\,\frac{d^p}{d(x^2)^p}\left[x^{-n}J_n(x)\right]$$

Hence, on putting n = ½,

$$J_{p+1/2}(x) = \frac{(-1)^p\,(2x)^{p+1/2}}{\sqrt{\pi}}\,\frac{d^p}{d(x^2)^p}\left(\frac{\sin x}{x}\right) \tag{3-75}$$


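The first few functions of half-odd order are given below.

p    $(\pi x/2)^{1/2}\,J_{p+1/2}(x)$    |    $(\pi x/2)^{1/2}\,J_{-p-1/2}(x)$

0    $\sin x$    |    $\cos x$

1    $\frac{\sin x}{x} - \cos x$    |    $-\sin x - \frac{\cos x}{x}$

2    $\left(\frac{3}{x^2} - 1\right)\sin x - \frac{3}{x}\cos x$    |    $\frac{3}{x}\sin x + \left(\frac{3}{x^2} - 1\right)\cos x$

3    $\left(\frac{15}{x^3} - \frac{6}{x}\right)\sin x - \left(\frac{15}{x^2} - 1\right)\cos x$    |    $-\left(\frac{15}{x^2} - 1\right)\sin x - \left(\frac{15}{x^3} - \frac{6}{x}\right)\cos x$

4    $\left(\frac{105}{x^4} - \frac{45}{x^2} + 1\right)\sin x - \left(\frac{105}{x^3} - \frac{10}{x}\right)\cos x$    |    $\left(\frac{105}{x^3} - \frac{10}{x}\right)\sin x + \left(\frac{105}{x^4} - \frac{45}{x^2} + 1\right)\cos x$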


When the differentiations in (75) are carried out it is easily established
that the asymptotic form of $J_{p+1/2}$ is given, for all p, by

$$\lim_{x\to\infty}J_{p+1/2}(x) = \left(\frac{2}{\pi x}\right)^{1/2}\sin\left(x - \frac{p\pi}{2}\right) \tag{3-76}$$
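The closed forms for half-odd order can be compared with the defining series (3-63). The sketch below covers only p = 0, 1, 2 and positive order; the function names, the truncation of the series and the test point x = 2 are illustrative assumptions.

```python
import math

def bessel_series(n, x, terms=40):
    """Partial sum of the series (3-63); assumes order n >= 0 and x > 0."""
    return sum(
        (-1) ** lam / (math.factorial(lam) * math.gamma(n + lam + 1))
        * (x / 2) ** (n + 2 * lam)
        for lam in range(terms)
    )

def half_odd_closed(p, x):
    """J_{p+1/2}(x) for p = 0, 1, 2 from the trigonometric closed forms."""
    s, c = math.sin(x), math.cos(x)
    table = {
        0: s,
        1: s / x - c,
        2: (3 / x**2 - 1) * s - 3 / x * c,
    }
    return math.sqrt(2 / (math.pi * x)) * table[p]

x = 2.0
differences = [abs(half_odd_closed(p, x) - bessel_series(p + 0.5, x)) for p in range(3)]
```

Each entry of `differences` should be at the level of rounding error, which also serves as an indirect check of the recurrence (3-67) used to generate the higher orders.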


3.9. Hankel Functions and Summary on Bessel Functions. — The
Bessel function $J_n(x)$ is only one particular solution of Bessel's differential
equation. However, as was noted in sec. 2.14, $J_{-n}(x)$ is also a particular
solution, for the differential equation is insensitive to the substitution of
−n for n. Hence a general solution of the form

$$y = aJ_n(x) + bJ_{-n}(x) \tag{3-77}$$

is at hand provided $J_n$ and $J_{-n}$ are different functions, i.e., are linearly independent.
As was also shown in sec. 2.14, this is true as long as n is not an
integer. The Hankel function,⁹ frequently used in physical problems, is of
interest only in connection with non-integral n and the following remarks
are restricted to that case. This function is a solution of Bessel's equation
of the form (77) with the constants a and b suitably chosen. We distinguish
two kinds of Hankel function, generally denoted by $H_n^{(1)}$ and $H_n^{(2)}$:

$$H_n^{(1)}(x) = \frac{i}{\sin n\pi}\left[e^{-n\pi i}J_n(x) - J_{-n}(x)\right]$$
$$H_n^{(2)}(x) = -\frac{i}{\sin n\pi}\left[e^{n\pi i}J_n(x) - J_{-n}(x)\right] \tag{3-78}$$

Hence, conversely,

$$J_n(x) = \tfrac{1}{2}\left[H_n^{(1)}(x) + H_n^{(2)}(x)\right]$$
$$J_{-n}(x) = \tfrac{1}{2}\left[e^{n\pi i}H_n^{(1)}(x) + e^{-n\pi i}H_n^{(2)}(x)\right]$$

These definitions hold, of course, for complex as well as for real values of the 
argument. Hankel functions are particularly useful for complex argu- 
ments, for they vanish strongly when the modulus of the argument 
approaches infinity, which is a requirement in many physical problems. 

The qualitative properties of Bessel and Hankel functions may be 
summarized in the following brief survey. 

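A. $J_n(z)$ is real if z is real, complex if z is complex.

1. At x = 0,

$$J_n(x) = \begin{cases} 1 & \text{if } n = 0 \\ 0 & \text{if } n > 0 \\ 0 & \text{if } n < 0, \text{ and } n \text{ is an integer} \\ \infty & \text{if } n < 0, \text{ and } n \text{ is not an integer} \end{cases}$$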

2. At x → ∞, all $J_n(x)$ oscillate, but with ever-decreasing amplitude
(provided x is real).

$$\lim_{x\to\infty}J_n(x) = \begin{cases} \pm\sqrt{\dfrac{2}{\pi x}}\,\cos\left(x - \dfrac{\pi}{4}\right) & \text{if } n \text{ is even} \\[2ex] \pm\sqrt{\dfrac{2}{\pi x}}\,\sin\left(x - \dfrac{\pi}{4}\right) & \text{if } n \text{ is odd} \end{cases} \tag{3-79}$$

⁹ See, for instance, Stratton, J., "Electromagnetic Theory," McGraw-Hill Book Co.,
1941. For applications in: propagation of radio waves, cf. Sommerfeld, A., Ann. der
Phys. 28, 692 (1909); theory of optical diffraction, cf. Wolfsohn, G., Handb. d. Phys.,
XX, p. 282; quantum mechanics, cf. Margenau, H., Phys. Rev. 46, 613 (1934).

B. $H_n(z)$ is complex if z is real, but $i^{n+1}H_n^{(1)}(ix)$ and $i^{-(n+1)}H_n^{(2)}(-ix)$
are always real if x is real and > 0.

1. At z = 0, both $H_n^{(1)}$ and $H_n^{(2)}$ become infinite. In fact, for n > 0,

$$\lim_{x\to 0}\,i^{n+1}H_n^{(1)}(ix) = \lim_{x\to 0}\,\frac{\Gamma(n)}{\pi}\left(\frac{2}{x}\right)^n$$

2. At z → ∞, either $H_n^{(1)}(z)$ or $H_n^{(2)}(z)$ vanishes exponentially.

$$\lim_{|z|\to\infty}H_n^{(1)}(z) = \begin{cases} 0 & \text{if the imaginary part of } z > 0 \\ \infty & \text{if the imaginary part of } z < 0 \end{cases}$$

$$\lim_{|z|\to\infty}H_n^{(2)}(z) = \begin{cases} \infty & \text{if the imaginary part of } z > 0 \\ 0 & \text{if the imaginary part of } z < 0 \end{cases}$$

The behavior at infinity of both $J_n$ and $H_n$ is most easily remembered by
noting the general similarity between

$H_n^{(1)}(z)$ and $e^{iz}$

$H_n^{(2)}(z)$ and $e^{-iz}$

and

$J_n(z)$ and $\tfrac{1}{2}(e^{iz} + e^{-iz}) = \cos z$

The important difference between the Bessel functions and the circular
functions is in the fact that the former have neither constant amplitude nor
constant wave length.

Useful Formulas Involving Bessel Functions. We conclude the discussion
of Bessel functions by appending here a list of formulas involving
Bessel functions. Some of these are easily proved with the use of the
theory here developed; to establish others reference should be made to
more comprehensive treatises, such as that of Nielsen¹⁰ and that of Gray
and Mathews.¹¹ An extensive table of differential equations having Bessel
functions as solutions is given in Jahnke and Emde.¹²

$$|J_0(x)| \le 1; \quad |J_n(x)| \le \frac{1}{\sqrt{2}} \text{ for } n \ge 1; \quad x \text{ real}$$

$$J_{-n}(z)J_{n-1}(z) + J_{-n+1}(z)J_n(z) = \frac{2\sin n\pi}{\pi z}$$

¹⁰ Nielsen, N., "Handbuch der Theorie der Cylinderfunctionen," Teubner, 1904.
¹¹ Gray, A., and Mathews, G. B., "A Treatise on Bessel Functions," Macmillan,
London, 1922.
¹² Jahnke, E., and Emde, F., "Funktionentafeln," Teubner, 1909, pp. 166-168.
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/■ 




+ 


m — 1 
m 


[Jn+l{x)Jn-l{x) - iJnix)^] 


m 


r x”^[j^{x)?dx 

— 1 «/o 


provided n is a positive integer and m + 1 > 0

$$J_{n-1}(x)H_n^{(1)}(x) - J_n(x)H_{n-1}^{(1)}(x) = H_{n-1}^{(2)}(x)J_n(x) - H_n^{(2)}(x)J_{n-1}(x) = \frac{2}{\pi i x}$$

+ >^) - (> + fji ^ + 5 xiT 

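$$\int_0^x J_n(x)\,dx = 2\sum_{\lambda=0}^{\infty}J_{n+2\lambda+1}(x)$$

$$\int_0^x x\left[J_n(ax)\right]^2dx = \frac{x^2}{2}\left\{\left[J_n(ax)\right]^2 - J_{n-1}(ax)J_{n+1}(ax)\right\}$$

This formula is also valid when all J's are replaced by $H^{(1)}$ or $H^{(2)}$.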


, r x-"+™j„(ar)dx = 

Jo 




13 


if 2n + 1 > m > - 1. 

/2\i/2 

lim Jniz) = ( “ ) X 

l^l^oa \‘^2/ 

$$\int_0^x xJ_n(\alpha x)J_n(\beta x)\,dx = \frac{\beta xJ_n(\alpha x)J_{n-1}(\beta x) - \alpha xJ_{n-1}(\alpha x)J_n(\beta x)}{\alpha^2 - \beta^2}$$

3.10. Hermite Polynomials and Functions. — In sec. 2.15 the Hermite
polynomial of degree n has been defined as the polynomial solution of
Hermite's differential equation

$$y'' - 2xy' + 2ny = 0 \tag{3-80}$$

¹³ The notation $O(1/z^2)$ is to be read "terms of the order $1/z^2$."

Such solutions were seen to exist when n is an integer. Explicitly,

$$H_n(x) = (2x)^n - \frac{n(n-1)}{1!}(2x)^{n-2} + \frac{n(n-1)(n-2)(n-3)}{2!}(2x)^{n-4} - \cdots \tag{3-81}$$

We shall now find an equivalent expression for $H_n$ in terms of a definite
integral. If we put

$$y_n = \frac{1}{2\pi i}\oint e^{x^2-(z-x)^2}\,z^{-n-1}\,dz \tag{3-82}$$

and take the contour around a circle which has the origin as its center, then

$$\frac{dy_n}{dx} = \frac{1}{2\pi i}\oint 2z^{-n}\,e^{x^2-(z-x)^2}\,dz \tag{3-82a}$$

and

$$\frac{d^2y_n}{dx^2} = \frac{1}{2\pi i}\oint 4z^{-n+1}\,e^{x^2-(z-x)^2}\,dz$$

The differentiations here may be performed under the integral sign. When
these derivatives are substituted on the left of the differential equation (80),
it is found that

$$y_n'' - 2xy_n' + 2ny_n = \frac{1}{2\pi i}\oint(4z^2 - 4xz + 2n)\,e^{x^2-(z-x)^2}\,z^{-n-1}\,dz = -\frac{2}{2\pi i}\oint\frac{d}{dz}\left(z^{-n}e^{x^2-(z-x)^2}\right)dz = 0$$

The last step follows because the contents of the parenthesis, being a
single-valued function of z, if n is an integer, takes the same value at the
initial and final points of the contour integration. It has thus been shown
that expression (82) is also a solution of Hermite's equation.¹⁴ Since it
represents a polynomial in x it must be identical with $H_n(x)$ except for a
constant multiplier. This constant may be found by computing, for
example, $H_n(0)$ from (81) and $y_n(0)$ from (82) for even n (since otherwise
$H_n(0)$ would vanish). Eq. (81) gives

$$H_n(0) = (-1)^{n/2}\,\frac{n!}{(n/2)!}$$

while from (82) we obtain

$$y_n(0) = \frac{1}{2\pi i}\oint e^{-z^2}z^{-n-1}\,dz = \frac{(-1)^{n/2}}{(n/2)!}$$

by the theorem of residues (eq. 3). Hence we see that $H_n$ is n! times $y_n$:

$$H_n(x) = \frac{n!}{2\pi i}\oint e^{x^2-(z-x)^2}\,z^{-n-1}\,dz \tag{3-83}$$

¹⁴ The function $y_n$ defined by (82) is a solution of Hermite's differential equation even
when n is non-integral, but in that case the contour must be specified differently: to
make the integrand return to its original value, the path must start at +∞, go in toward
the origin, encircle it in a counter-clockwise sense and return to +∞.


This result may be expressed in a different way. On examining (82) in the light of the theorem of residues it is apparent that $y_n$ is the coefficient of $z^n$ in the expansion of $e^{x^2-(z-x)^2}$ as a power series in z. Hence

$$ e^{x^2-(z-x)^2} = \sum_{n=0}^{\infty} y_n z^n = \sum_{n=0}^{\infty} \frac{H_n(x)}{n!}\, z^n \tag{3-84} $$

Recurrence relations between Hermite polynomials of different degree are easily derived. The first is implicit in eq. (82a), which may now be written $y_n'(x) = 2y_{n-1}(x)$, or

$$ H_n'(x) = 2nH_{n-1}(x) \tag{3-85} $$

The second follows from the differential equation

$$ H_n''(x) - 2xH_n'(x) + 2nH_n(x) = 0 \tag{3-86} $$

Others may be derived ad libitum by repeated application of (85):

$$ H_n'' = 4n(n-1)H_{n-2}, \quad \text{etc.} $$
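The series (81) and the relations (85) and (86) can be checked mechanically for small n. The following is a minimal Python sketch, not part of the original text; the helper names `hermite_coeffs`, `derivative` and `ode_residual` are illustrative assumptions. Polynomials are stored as integer coefficient lists, lowest power first.

```python
from math import factorial

def hermite_coeffs(n):
    """Coefficients c[j] of x**j in H_n(x), taken from the series (3-81)."""
    c = [0] * (n + 1)
    for s in range(n // 2 + 1):
        # s-th term: (-1)^s * n(n-1)...(n-2s+1)/s! * (2x)^(n-2s)
        falling = factorial(n) // factorial(n - 2 * s)
        c[n - 2 * s] = (-1) ** s * (falling // factorial(s)) * 2 ** (n - 2 * s)
    return c

def derivative(c):
    """Coefficient list of the derivative of the polynomial c."""
    return [j * c[j] for j in range(1, len(c))] or [0]

def ode_residual(n):
    """Coefficients of H'' - 2x H' + 2n H, which (3-86) says must all vanish."""
    h = hermite_coeffs(n)
    h1 = derivative(h)
    h2 = derivative(h1)
    res = [0] * (n + 2)
    for j, a in enumerate(h2):
        res[j] += a
    for j, a in enumerate(h1):
        res[j + 1] -= 2 * a          # the term -2x H'
    for j, a in enumerate(h):
        res[j] += 2 * n * a
    return res

if __name__ == "__main__":
    for n in range(1, 6):
        lhs = derivative(hermite_coeffs(n))
        rhs = [2 * n * a for a in hermite_coeffs(n - 1)]
        print(n, lhs == rhs, not any(ode_residual(n)))
```

Exact integer arithmetic is used so that the comparison with (85) and (86) is an identity check rather than a floating-point approximation.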


Thus far two representations of Hn have been obtained, the series form 
(81) and the integral form (83). A third may be deduced from (84). Let 
us take the n-th derivative with respect to z on both sides. The left 
becomes 

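$$ e^{x^2}\,\frac{\partial^n}{\partial z^n}\, e^{-(z-x)^2} = e^{x^2}(-1)^n \frac{\partial^n}{\partial x^n}\, e^{-(z-x)^2} $$

and the right simply

$$ H_n(x) + H_{n+1}(x)\,z + \cdots $$

These two expressions are equal for all values of z. On putting z = 0, there results

$$ H_n(x) = (-1)^n e^{x^2}\,\frac{d^n}{dx^n}\, e^{-x^2} \tag{3-87} $$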

A function closely related to the polynomial $H_n(x)$ was introduced in sec. 2.16. It is

$$ y = e^{-x^2/2} H_n(x) \tag{3-88} $$

and satisfies the differential equation

$$ y'' + (1 - x^2 + 2n)\,y = 0 \tag{3-89} $$

The function defined by (88) is called the Hermite (orthogonal) function. 
It is of interest because it appears (cf. sec. 11.11) as an eigenfunction in the 
quantum mechanical problem of the simple harmonic oscillator. We shall 
here derive a few integrals involving this function which will be found 
useful later. 

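The first is the integral over the product of two Hermite functions,

$$ \int_{-\infty}^{\infty} e^{-x^2} H_n(x) H_m(x)\,dx $$

In view of (84)

$$ e^{x^2-(z_1-x)^2}\, e^{x^2-(z_2-x)^2} = \sum_{\lambda}\sum_{\mu} \frac{H_\lambda(x) H_\mu(x)}{\lambda!\,\mu!}\, z_1^{\lambda} z_2^{\mu} $$

Hence, multiplying each side by $e^{-x^2}$ and integrating

$$ \sum_{\lambda,\mu} \frac{z_1^{\lambda} z_2^{\mu}}{\lambda!\,\mu!} \int_{-\infty}^{\infty} e^{-x^2} H_\lambda(x) H_\mu(x)\,dx = \int_{-\infty}^{\infty} e^{x^2-(z_1-x)^2-(z_2-x)^2}\,dx \tag{3-90} $$

The integral on the right has the value $\sqrt{\pi}\, e^{2z_1z_2}$. This may be expanded to read

$$ \sqrt{\pi}\sum_{\lambda} \frac{2^{\lambda} z_1^{\lambda} z_2^{\lambda}}{\lambda!} = \sqrt{\pi}\sum_{\lambda}\sum_{\mu} \frac{2^{\lambda} z_1^{\lambda} z_2^{\mu}}{\lambda!}\,\delta_{\lambda\mu} \tag{3-91} $$

where the single summation over λ has been changed into a double summation over λ and μ by the artificial use of the Kronecker δ-symbol, defined on page 100. Since (90) is true for every value of $z_1$ and $z_2$, the individual coefficients of every power of $z_1$ and $z_2$ in both expansions must be equal. On comparing (91) with the left side of (90) we see that

$$ \frac{1}{n!\,m!}\int_{-\infty}^{\infty} e^{-x^2} H_n(x) H_m(x)\,dx = \sqrt{\pi}\,\frac{2^n}{n!}\,\delta_{nm} $$

or

$$ \int_{-\infty}^{\infty} e^{-x^2} H_n(x) H_m(x)\,dx = 2^n n!\,\sqrt{\pi}\,\delta_{nm} \tag{3-92} $$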


In evaluating it, use is made of the formula:

The integral

$$ \int_{-\infty}^{\infty} x\, e^{-x^2} H_n(x) H_m(x)\,dx $$


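can be evaluated in a similar way. In place of (90) we now write

$$ \sum_{\lambda,\mu} \frac{z_1^{\lambda} z_2^{\mu}}{\lambda!\,\mu!} \int_{-\infty}^{\infty} x\, e^{-x^2} H_\lambda(x) H_\mu(x)\,dx = \int_{-\infty}^{\infty} x\, e^{x^2-(z_1-x)^2-(z_2-x)^2}\,dx = \sqrt{\pi}\,(z_1 + z_2)\, e^{2z_1z_2} $$

The last result is, on expansion,

$$ \sqrt{\pi}\sum_{\lambda} \frac{2^{\lambda}}{\lambda!}\left(z_1^{\lambda+1} z_2^{\lambda} + z_1^{\lambda} z_2^{\lambda+1}\right) $$

Equating coefficients of $z_1^n z_2^m$ then yields

$$ \int_{-\infty}^{\infty} x\, e^{-x^2} H_n(x) H_m(x)\,dx = \sqrt{\pi}\left(2^{n-1} n!\,\delta_{m,n-1} + 2^n (n+1)!\,\delta_{m,n+1}\right) \tag{3-93} $$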

The integral vanishes when n = m and also when n and m differ by more than unity. The same method may be used to calculate other integrals of the type

$$ \int_{-\infty}^{\infty} x^r e^{-x^2} H_n(x) H_m(x)\,dx $$

Later, however, we shall learn of simpler ways, involving matrix algebra, for deriving these from the result established in eq. (93). (See problem at the end of sec. 11.17.)
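The integrals (92) and (93) can be confirmed exactly for small n and m. The sketch below is an illustration and not the authors' method: instead of the generating function it expands the product of polynomials and uses the Gaussian moments $\int x^k e^{-x^2}dx = \sqrt{\pi}\,(k-1)!!/2^{k/2}$ for even k (zero for odd k). The function names are assumptions, and every result is reported in units of $\sqrt{\pi}$.

```python
from fractions import Fraction
from math import factorial

def hermite_coeffs(n):
    """Coefficients of x**j in H_n(x), from the series (3-81)."""
    c = [0] * (n + 1)
    for s in range(n // 2 + 1):
        falling = factorial(n) // factorial(n - 2 * s)
        c[n - 2 * s] = (-1) ** s * (falling // factorial(s)) * 2 ** (n - 2 * s)
    return c

def gauss_moment(k):
    """Integral of x^k e^{-x^2} over the real line, divided by sqrt(pi)."""
    if k % 2:
        return Fraction(0)
    m = Fraction(1)
    for odd in range(1, k, 2):       # builds (k-1)!! / 2^(k/2)
        m *= Fraction(odd, 2)
    return m

def weighted_integral(n, m, power=0):
    """Integral of x^power e^{-x^2} H_n H_m, divided by sqrt(pi), exactly."""
    a, b = hermite_coeffs(n), hermite_coeffs(m)
    return sum(ai * bj * gauss_moment(i + j + power)
               for i, ai in enumerate(a) for j, bj in enumerate(b))

if __name__ == "__main__":
    print(weighted_integral(3, 3))       # compare with 2^3 * 3!
    print(weighted_integral(3, 2, 1))    # compare with 2^2 * 3!
    print(weighted_integral(3, 1, 1))    # expected to vanish
```

With `power=2` the same routine gives the integrals with an extra factor $x^2$, which the text notes can be obtained by the same method.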

Example. A simple harmonic oscillator, if treated by the methods of quantum mechanics, has a distribution of mass about the attracting center which is given by

$$ P(\xi) = c\, e^{-\xi^2} [H_n(\xi)]^2 $$

where $\xi = \sqrt{\beta}\,x$, β being a quantity characteristic of the oscillator, and n is a quantum number which depends on the total energy possessed by the vibrating point. The moment of inertia of this mass distribution is given by

$$ m\overline{x^2} = m\,\frac{\int_{-\infty}^{\infty} x^2 e^{-\xi^2}[H_n(\xi)]^2\,dx}{\int_{-\infty}^{\infty} e^{-\xi^2}[H_n(\xi)]^2\,dx} = \frac{m}{\beta}\,\frac{\int_{-\infty}^{\infty} \xi^2 e^{-\xi^2}[H_n(\xi)]^2\,d\xi}{\int_{-\infty}^{\infty} e^{-\xi^2}[H_n(\xi)]^2\,d\xi} $$

The integral in the denominator has already been calculated (cf. 92) and is equal to $2^n n!\sqrt{\pi}$. The integral in the numerator may be computed by the same method and is found to be

$$ \frac{2n+1}{2}\, 2^n n!\,\sqrt{\pi} $$

Hence

$$ m\overline{x^2} = \frac{m}{\beta}\,\frac{2n+1}{2} $$

Later (cf. sec. 11.11) it will be shown that $\beta = 4\pi^2 m\nu_0/h$ so that

$$ m\overline{x^2} = \frac{2n+1}{2}\,\frac{h}{4\pi^2\nu_0} $$

where $\nu_0$ is the "classical" frequency of the oscillator.


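LIST OF HERMITE POLYNOMIALS

$$
\begin{aligned}
H_0(\xi) &= 1 \\
H_1(\xi) &= 2\xi \\
H_2(\xi) &= 4\xi^2 - 2 \\
H_3(\xi) &= 8\xi^3 - 12\xi \\
H_4(\xi) &= 16\xi^4 - 48\xi^2 + 12 \\
H_5(\xi) &= 32\xi^5 - 160\xi^3 + 120\xi \\
H_6(\xi) &= 64\xi^6 - 480\xi^4 + 720\xi^2 - 120 \\
H_7(\xi) &= 128\xi^7 - 1344\xi^5 + 3360\xi^3 - 1680\xi \\
H_8(\xi) &= 256\xi^8 - 3584\xi^6 + 13440\xi^4 - 13440\xi^2 + 1680 \\
H_9(\xi) &= 512\xi^9 - 9216\xi^7 + 48384\xi^5 - 80640\xi^3 + 30240\xi \\
H_{10}(\xi) &= 1024\xi^{10} - 23040\xi^8 + 161280\xi^6 - 403200\xi^4 + 302400\xi^2 - 30240
\end{aligned}
$$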


3.11. Laguerre Polynomials and Functions. — The theory of Laguerre polynomials may be developed along lines very similar to those of the last section. A Laguerre polynomial $L_n(x)$ has been defined in sec. 2.16 as the polynomial solution of Laguerre's differential eq. (2-67):

$$ xy'' + (1 - x)y' + ny = 0 \tag{3-94} $$


It exists whenever n is an integer and was found to be

$$ L_n(x) = (-1)^n\left(x^n - \frac{n^2}{1!}x^{n-1} + \frac{n^2(n-1)^2}{2!}x^{n-2} - \cdots + (-1)^n n!\right) \tag{3-95} $$

We first establish a representation of $L_n$ in the form of a definite integral. Consider

$$ y_n = \frac{1}{2\pi i}\oint \frac{z^{-n-1}}{1-z}\,\exp\left(\frac{-xz}{1-z}\right)dz \tag{3-96} $$


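where the contour is taken to include the origin. Differentiations with respect to x may be performed under the integral sign; hence

$$ y_n' = \frac{1}{2\pi i}\oint \frac{z^{-n-1}}{1-z}\left(\frac{-z}{1-z}\right)\exp\left(\frac{-xz}{1-z}\right)dz $$

and

$$ y_n'' = \frac{1}{2\pi i}\oint \frac{z^{-n-1}}{1-z}\left(\frac{z}{1-z}\right)^2\exp\left(\frac{-xz}{1-z}\right)dz $$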




On substituting in the left-hand side of (94) we find

$$ \frac{1}{2\pi i}\oint\left[\frac{xz^2}{(1-z)^2} - \frac{(1-x)z}{1-z} + n\right]\frac{z^{-n-1}}{1-z}\,\exp\left(\frac{-xz}{1-z}\right)dz $$


But this is easily seen to be

$$ -\frac{1}{2\pi i}\oint \frac{d}{dz}\left[\frac{z^{-n}}{1-z}\,\exp\left(\frac{-xz}{1-z}\right)\right]dz $$


an expression which vanishes because the quantity in brackets takes on the 
same value at the initial and final point of the contour. Hence (96) is a 
solution of Laguerre’s differential equation. Moreover, it is a polynomial, 
as an analysis in the light of the theorem of residues will show. Its relation to $L_n(x)$ may be established by computing both $y_n(x)$ and $L_n(x)$ for a particular value of x, say zero. From (95)

$$ L_n(0) = n! $$

from (96)

$$ y_n(0) = \frac{1}{2\pi i}\oint z^{-n-1}(1 + z + z^2 + \cdots)\,dz = 1 $$

Therefore

$$ L_n = n!\,y_n \tag{3-97} $$

Again, using the theorem of residues (eq. 3), we find from (96) that

$$ (1-z)^{-1}\exp\left(\frac{-xz}{1-z}\right) = \sum_{n=0}^{\infty} y_n z^n = \sum_{n=0}^{\infty} \frac{L_n(x)}{n!}\,z^n \tag{3-98} $$

This result is quite similar to (84).

Next, we turn our attention to the recurrence relations existing between Laguerre polynomials of different degrees, and between the derivatives of a given polynomial. A relation of the latter type follows at once from the differential equation:

$$ xL_n'' + (1 - x)L_n' + nL_n = 0 \tag{3-99} $$


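The former relation may be obtained by differentiating (98) with respect to z:

$$ \frac{1 - x - z}{(1-z)^3}\,\exp\left(\frac{-xz}{1-z}\right) = \sum_{\lambda=1}^{\infty} \frac{L_\lambda(x)\,z^{\lambda-1}}{(\lambda-1)!} $$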


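When the left-hand side of this equation is again expressed in terms of Laguerre polynomials with the use of (98), the result may be written

$$ (1 - x - z)\sum_{\lambda} \frac{L_\lambda(x)\,z^{\lambda}}{\lambda!} = (1 - 2z + z^2)\sum_{\lambda} \frac{L_\lambda(x)\,z^{\lambda-1}}{(\lambda-1)!} $$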


On equating the coefficients of $z^n$, there results

$$ (1-x)\frac{L_n}{n!} - \frac{L_{n-1}}{(n-1)!} = \frac{L_{n+1}}{n!} - \frac{2L_n}{(n-1)!} + \frac{L_{n-1}}{(n-2)!} $$

whence,

$$ (1 + 2n - x)L_n - n^2 L_{n-1} - L_{n+1} = 0 \tag{3-100} $$

which is the relation here sought. 
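The recurrence (100) offers a convenient way to generate the polynomials, and it can be compared with the explicit series (95). The following Python sketch is an illustration only; the function names are assumptions, and the starting values $L_0 = 1$, $L_1 = 1 - x$ are read off from (95).

```python
from math import factorial

def laguerre_series(n):
    """Coefficients of x**j in L_n(x), from the explicit series (3-95)."""
    c = [0] * (n + 1)
    for s in range(n + 1):
        # term s: (-1)^(n+s) * [n(n-1)...(n-s+1)]^2 / s! * x^(n-s)
        falling = factorial(n) // factorial(n - s)
        c[n - s] = (-1) ** (n + s) * (falling ** 2 // factorial(s))
    return c

def laguerre_recurrence(n_max):
    """L_0 ... L_{n_max} from L_{n+1} = (1 + 2n - x) L_n - n^2 L_{n-1}."""
    polys = [[1], [1, -1]]
    for n in range(1, n_max):
        prev, cur = polys[n - 1], polys[n]
        nxt = [0] * (n + 2)
        for j, a in enumerate(cur):
            nxt[j] += (1 + 2 * n) * a
            nxt[j + 1] -= a              # the term -x L_n
        for j, a in enumerate(prev):
            nxt[j] -= n * n * a
        polys.append(nxt)
    return polys[:n_max + 1]

if __name__ == "__main__":
    for n, p in enumerate(laguerre_recurrence(4)):
        print(n, p, p == laguerre_series(n))
```

Note that with the normalization used here $L_n(0) = n!$, so the coefficients grow quickly; Python integers keep the comparison exact.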

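For some purposes it is convenient to have $L_n$ in the form of a derivative. To find it we differentiate (98) n times with respect to z and afterwards put z = 0, thus obtaining

$$ L_n(x) = \lim_{z\to 0}\frac{\partial^n}{\partial z^n}\left[(1-z)^{-1}\exp\left(\frac{-xz}{1-z}\right)\right] $$

The reader will be able to show without difficulty that

$$ \lim_{z\to 0}\frac{\partial^n}{\partial z^n}\left[(1-z)^{-1}\exp\left(\frac{-xz}{1-z}\right)\right] = e^{x}\,\frac{d^n}{dx^n}\left(x^n e^{-x}\right) $$

Hence

$$ L_n(x) = e^{x}\,\frac{d^n}{dx^n}\left(x^n e^{-x}\right) \tag{3-101} $$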


The associated Laguerre polynomial, of degree n − k, was shown in sec. 2.16 to satisfy the differential equation (2-70)

$$ xy'' + (k + 1 - x)y' + (n - k)y = 0 $$

and is given by

$$ L_n^k(x) = \frac{d^k}{dx^k}\,L_n(x) \tag{3-102} $$


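On differentiating (98) k times with respect to x, it is seen at once that

$$ (-1)^k\,\frac{z^k}{(1-z)^{k+1}}\,\exp\left(\frac{-xz}{1-z}\right) = \sum_{\lambda=k}^{\infty} \frac{L_\lambda^k(x)}{\lambda!}\,z^{\lambda} \tag{3-103} $$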


A function of great importance in quantum mechanics is the associated Laguerre function, for it describes, in a sense to be discussed fully in Chapter 11, the motion of the electron in the hydrogen atom. It satisfies differential eq. (71) in sec. 2.16, and was there shown to be represented by

$$ y_{n,k} = e^{-x/2}\, x^{(k-1)/2}\, L_n^k(x) \tag{3-104} $$


Certain integrals involving this function are often used and will here be calculated. They are of the form

$$ I_{n,m} = \int_0^{\infty} e^{-x} x^{k-1} L_n^k(x) L_m^k(x)\, x^p\,dx $$

where p is another integer which we shall take in this work to be either 1, 2, or 3. Furthermore, our interest will be confined to $I_{n,n}$. If we multiply eq. (103), in which $z_1$ is written for z, by a similar one in which z is replaced by $z_2$, there results

$$ \sum_{\lambda,\mu=k}^{\infty} \frac{z_1^{\lambda} z_2^{\mu}}{\lambda!\,\mu!}\, L_\lambda^k(x) L_\mu^k(x) = (z_1z_2)^k (1-z_1)^{-k-1}(1-z_2)^{-k-1}\exp\left(-\frac{xz_1}{1-z_1} - \frac{xz_2}{1-z_2}\right) $$

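Let us now multiply each side of this equation by $e^{-x}x^{k+p-1}$ and then integrate with respect to x. In view of the definition of $I_{n,m}$, the result may be written

$$ \sum_{\lambda,\mu=k}^{\infty} \frac{z_1^{\lambda} z_2^{\mu}}{\lambda!\,\mu!}\, I_{\lambda,\mu} = (z_1z_2)^k\,[(1-z_1)(1-z_2)]^{-k-1}\int_0^{\infty} x^{k+p-1}\exp\left\{-x\left[(1-z_1)^{-1} + (1-z_2)^{-1} - 1\right]\right\}dx $$

Now

$$ \int_0^{\infty} x^r e^{-ax}\,dx = \frac{r!}{a^{r+1}} $$

as may be shown by r-fold partial integration. If we put

$$ a = (1-z_1)^{-1} + (1-z_2)^{-1} - 1 = (1 - z_1z_2)(1-z_1)^{-1}(1-z_2)^{-1} $$

we obtain, therefore,

$$ \sum_{\lambda,\mu=k}^{\infty} \frac{z_1^{\lambda} z_2^{\mu}}{\lambda!\,\mu!}\, I_{\lambda,\mu} = (k+p-1)!\,\frac{(z_1z_2)^k\,[(1-z_1)(1-z_2)]^{p-1}}{(1 - z_1z_2)^{k+p}} \tag{3-105} $$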

When the denominator on the right is expanded by the binomial theorem

$$ (1 - z_1z_2)^{-(k+p)} = \sum_{\lambda} \frac{(k+p+\lambda-1)!}{(k+p-1)!\,\lambda!}\,(z_1z_2)^{\lambda} $$

the right-hand side of (105) becomes

$$ (1-z_1)^{p-1}(1-z_2)^{p-1}\sum_{\lambda} \frac{(k+p+\lambda-1)!}{\lambda!}\,(z_1z_2)^{k+\lambda} \tag{3-106} $$

Thus, in view of (105), $I_{n,n}$ is simply $(n!)^2$ times the coefficient of $(z_1z_2)^n$ of this expression.

a. When p = 1, this is obtained by choosing that term of the summation in (106) for which k + λ = n, that is λ = n − k.

$$ I_{n,n} = \frac{(n!)^3}{(n-k)!} \tag{3-107a} $$


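b. When p = 2, (106) becomes

$$ \sum_{\lambda} \frac{(k+\lambda+1)!}{\lambda!}\left[(z_1z_2)^{k+\lambda} - z_1^{k+\lambda+1}z_2^{k+\lambda} - z_1^{k+\lambda}z_2^{k+\lambda+1} + (z_1z_2)^{k+\lambda+1}\right] $$

The second and third terms in the bracket, in which $z_1$ and $z_2$ appear with different exponents, cannot contribute to $I_{n,n}$; the first term contributes when λ = n − k, the last when λ = n − k − 1. Hence

$$ I_{n,n} = (n!)^2\left[\frac{(n+1)!}{(n-k)!} + \frac{n!}{(n-k-1)!}\right] = \frac{(n!)^3}{(n-k)!}\,(2n - k + 1) \tag{3-107b} $$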


c. When p = 3, the significant parts of (106) are

$$ \sum_{\lambda} \frac{(k+\lambda+2)!}{\lambda!}\left[(z_1z_2)^{k+\lambda} + 4(z_1z_2)^{k+\lambda+1} + (z_1z_2)^{k+\lambda+2}\right] $$

terms with different exponents of $z_1$ and $z_2$ having been omitted. Consequently,

$$ I_{n,n} = (n!)^2\left[\frac{(n+2)!}{(n-k)!} + \frac{4(n+1)!}{(n-k-1)!} + \frac{n!}{(n-k-2)!}\right] = \frac{(n!)^3}{(n-k)!}\,(6n^2 - 6nk + k^2 + 6n - 3k + 2) \tag{3-107c} $$

Obviously, the same process permits the evaluation of $I_{n,n}$ for any value of p. The quantities $I_{n,m}$ for n ≠ m are rarely needed, but can be obtained by this method also.
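The closed forms (107a), (107b) and (107c) can be compared with a direct evaluation of the defining integral. The sketch below is illustrative and does not follow the generating-function argument of the text: it forms $L_n^k$ by differentiating the series (95) k times as in (102), multiplies out the square, and integrates term by term with $\int_0^\infty x^m e^{-x}dx = m!$. All function names are assumptions.

```python
from math import factorial

def laguerre_series(n):
    """Coefficients of x**j in L_n(x), from the explicit series (3-95)."""
    c = [0] * (n + 1)
    for s in range(n + 1):
        falling = factorial(n) // factorial(n - s)
        c[n - s] = (-1) ** (n + s) * (falling ** 2 // factorial(s))
    return c

def assoc_laguerre(n, k):
    """Coefficients of L_n^k(x) = d^k/dx^k L_n(x), eq. (3-102); needs k <= n."""
    c = laguerre_series(n)
    for _ in range(k):
        c = [j * c[j] for j in range(1, len(c))]
    return c

def laguerre_integral(n, k, p):
    """I_{n,n}: integral over (0, inf) of e^{-x} x^{k-1+p} [L_n^k(x)]^2."""
    c = assoc_laguerre(n, k)
    return sum(a * b * factorial(i + j + k - 1 + p)
               for i, a in enumerate(c) for j, b in enumerate(c))

def closed_form(n, k, p):
    """Right-hand sides of (3-107a), (3-107b), (3-107c) for p = 1, 2, 3."""
    base = factorial(n) ** 3 // factorial(n - k)
    factor = {1: 1,
              2: 2 * n - k + 1,
              3: 6 * n * n - 6 * n * k + k * k + 6 * n - 3 * k + 2}[p]
    return base * factor

if __name__ == "__main__":
    for p in (1, 2, 3):
        print(p, laguerre_integral(3, 1, p), closed_form(3, 1, p))
```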

Example. The electronic charge of the hydrogen atom is distributed about the proton as origin in accordance with the distribution function

$$ P(\rho) = c\, e^{-\rho}\rho^{2l+2}\left[L_{n+l}^{2l+1}(\rho)\right]^2 $$

as will be shown in sec. 11.13. In this expression n and l stand for the "total" and "angular momentum" quantum numbers which designate the state of the atom; c is a constant which is different for different states, and ρ is proportional to r, the radius vector: $\rho = (2/na_0)\,r$. The proportionality constant depends on the quantum number n and differs for different states of the atom. $a_0$ is the fundamental constant known as the "first Bohr radius." $P(\rho)$, finally, represents the charge to be found within the spherical shell enclosed between ρ and ρ + dρ.

Let it be desired to find the mean value of 1/r and r for this distribution. Clearly

$$ \overline{r^{-1}} = \frac{\int_0^{\infty} \frac{1}{r}P(\rho)\,dr}{\int_0^{\infty} P(\rho)\,dr} = \frac{2}{na_0}\,\frac{\int_0^{\infty} \frac{1}{\rho}P(\rho)\,d\rho}{\int_0^{\infty} P(\rho)\,d\rho} $$

The integral in the numerator is simply $I_{n+l,\,n+l}$ with k = 2l + 1 and p = 1, that in the denominator is also $I_{n+l,\,n+l}$ with the same k, but with p = 2. Hence, using (107a and b)

$$ \overline{r^{-1}} = \frac{2}{na_0}\,\frac{1}{2n} = \frac{1}{n^2a_0} $$

Similarly,

$$ \bar{r} = \frac{\int_0^{\infty} rP(\rho)\,dr}{\int_0^{\infty} P(\rho)\,dr} = \frac{na_0}{2}\,\frac{\int_0^{\infty} \rho P(\rho)\,d\rho}{\int_0^{\infty} P(\rho)\,d\rho} = \frac{na_0}{2}\,\frac{I_{n+l,\,n+l}(p=3)}{I_{n+l,\,n+l}(p=2)}, \quad \text{with } k = 2l + 1 $$

In view of (107c and b)

$$ \bar{r} = \frac{na_0}{2}\,\frac{6n^2 - 2l(l+1)}{2n} = \frac{a_0}{2}\left[3n^2 - l(l+1)\right] $$

For the ground state of the hydrogen atom (n = 1, l = 0)

$$ \overline{r^{-1}} = a_0^{-1} \quad \text{and} \quad \bar{r} = \tfrac{3}{2}\,a_0 $$
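The two mean values of this example follow from ratios of the closed forms (107a), (107b) and (107c) alone, and the arithmetic is easily automated. The sketch below is an illustration with assumed function names; lengths are expressed in units of the first Bohr radius, so the returned numbers multiply $a_0$ (or $1/a_0$).

```python
from fractions import Fraction

def ratio_p1_to_p2(N, k):
    """I_{N,N}(p=1) / I_{N,N}(p=2), from (3-107a) and (3-107b)."""
    return Fraction(1, 2 * N - k + 1)

def ratio_p3_to_p2(N, k):
    """I_{N,N}(p=3) / I_{N,N}(p=2), from (3-107c) and (3-107b)."""
    numerator = 6 * N * N - 6 * N * k + k * k + 6 * N - 3 * k + 2
    return Fraction(numerator, 2 * N - k + 1)

def mean_inverse_r(n, l):
    """Mean value of 1/r in units of 1/a0 for quantum numbers n, l."""
    return Fraction(2, n) * ratio_p1_to_p2(n + l, 2 * l + 1)

def mean_r(n, l):
    """Mean value of r in units of a0 for quantum numbers n, l."""
    return Fraction(n, 2) * ratio_p3_to_p2(n + l, 2 * l + 1)

if __name__ == "__main__":
    for n in range(1, 4):
        for l in range(n):
            print(n, l, mean_inverse_r(n, l), mean_r(n, l))
```

The substitution N = n + l, k = 2l + 1 is the one made in the text; the quantum-number range l = 0, ..., n − 1 used in the loop is the usual one and is an assumption here.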
3.12. Generating Functions. — A simple and powerful way of representing functions of the more unfamiliar types is by means of generating functions, that is, functions of two arguments which, when expanded in a power series with respect to one argument, contain the functions to be generated as coefficients involving the other argument parametrically. Examples of generating functions have occurred in the preceding sections; they will here be exhibited once more for easy reference.

1. Legendre Polynomials.

$$ (1 - 2xy + y^2)^{-1/2} = \sum_{l=0}^{\infty} P_l(x)\,y^l \qquad \text{(Cf. eq. 24)} $$

2. Associated Legendre Polynomials.

$$ \frac{(2m)!\,(1-x^2)^{m/2}\,y^m}{2^m m!\,(1 - 2xy + y^2)^{m+1/2}} = \sum_{l=m}^{\infty} P_l^m(x)\,y^l $$

This was not used in the text, but is easily derived by differentiation from (24) on the basis of (43).

3. Bessel Functions (of integral order).

$$ \exp\left[\frac{x}{2}\left(z - \frac{1}{z}\right)\right] = \sum_{n=-\infty}^{\infty} J_n(x)\,z^n \qquad \text{(Cf. 64)} $$

4. Hermite Polynomials.

$$ \exp\left[x^2 - (z-x)^2\right] = \sum_{n=0}^{\infty} \frac{H_n(x)}{n!}\,z^n \qquad \text{(Cf. 84)} $$

5. Laguerre Polynomials.

$$ (1-z)^{-1}\exp\left(\frac{-xz}{1-z}\right) = \sum_{n=0}^{\infty} \frac{L_n(x)}{n!}\,z^n \qquad \text{(Cf. 98)} $$

6. Associated Laguerre Polynomials.

$$ (-1)^k\,\frac{z^k}{(1-z)^{k+1}}\,\exp\left(\frac{-xz}{1-z}\right) = \sum_{n=k}^{\infty} \frac{L_n^k(x)}{n!}\,z^n \qquad \text{(Cf. 103)} $$

7. Tschebyscheff Polynomials.

$$ \frac{1 - xy}{1 - 2xy + y^2} = \sum_{n=0}^{\infty} T_n(x)\,y^n $$

(Not proved in text, but Cf. eq. 2-54.)
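A generating function can be tested numerically by comparing its closed form with a truncated power series. The sketch below does this for entry 4 (Hermite polynomials) only; it is an illustration, the function names and the sample point x = 0.7, z = 0.3 are arbitrary assumptions, and the polynomial values are taken from the series (81).

```python
from math import exp, factorial

def hermite_value(n, x):
    """H_n(x) evaluated from the explicit series (3-81)."""
    total = 0.0
    for s in range(n // 2 + 1):
        coeff = factorial(n) // (factorial(n - 2 * s) * factorial(s))
        total += (-1) ** s * coeff * (2 * x) ** (n - 2 * s)
    return total

def generating_sum(x, z, terms=30):
    """Partial sum of H_n(x) z^n / n! over n = 0 ... terms-1."""
    return sum(hermite_value(n, x) * z ** n / factorial(n)
               for n in range(terms))

def closed_form(x, z):
    """Left-hand side of the generating relation, exp[x^2 - (z - x)^2]."""
    return exp(x * x - (z - x) ** 2)

if __name__ == "__main__":
    x, z = 0.7, 0.3
    print(generating_sum(x, z), closed_form(x, z))
```

For small |z| the series converges rapidly, so thirty terms are far more than needed at this sample point.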

3.13. Linear Dependence. — A set of functions $\varphi_1, \varphi_2, \ldots, \varphi_n$ is said to be linearly dependent when a set of constants, $k_1, k_2, \ldots, k_n$, not all zero, exists such that

$$ \sum_{\lambda=1}^{n} k_\lambda\varphi_\lambda = 0 $$

If this relation can be satisfied only by putting all $k_\lambda$ equal to zero, the functions are linearly independent.

A criterion for linear dependence is easily derived. We observe that the integral

$$ I(k_1 \cdots k_n) = \int\left|\sum_{\lambda} k_\lambda\varphi_\lambda\right|^2 dx \tag{3-108} $$

taken over the range of x in which the functions $\varphi_\lambda$ are considered, cannot be smaller than zero. It will attain the minimum, zero, for specific values of the parameters $k_\lambda$. Now it will first be shown that, if I has a stationary value at all, this value must be zero.

For this purpose, let us vary I, replacing every $k_\lambda$ by $k_\lambda(1 + \delta k)$. The result is

$$ I + \delta I = (1 + \delta k)^2 I $$

and

$$ \delta I = [2\,\delta k + (\delta k)^2]\,I $$

Where I has a stationary value, δI must vanish; but it is seen that δI cannot vanish unless I itself is zero. Therefore the stationary value of (108) is zero, and we may say that the conditions

$$ \frac{\partial I}{\partial k_\lambda} = 0, \quad \frac{\partial I}{\partial k_\lambda^*} = 0, \quad \lambda = 1, 2, \ldots n \tag{3-109} $$

are both sufficient and necessary for the vanishing of I or, what amounts to the same thing, for the validity of

$$ \sum_{\lambda=1}^{n} k_\lambda\varphi_\lambda = 0 $$

If, therefore, eqs. (109) have a solution other than the trivial one in which all $k_\lambda$ are zero, the functions $\varphi_\lambda$ are linearly dependent.

But (108) may be written in a different way. If we define the coefficients

$$ a_{\lambda\mu} = \int \varphi_\lambda^*\varphi_\mu\,dx $$

we have

$$ I = \sum_{\lambda\mu} a_{\lambda\mu}\,k_\lambda^* k_\mu $$

and eqs. (109) now read

$$ \sum_{\lambda} a_{\lambda\mu}\,k_\lambda^* = 0, \quad \sum_{\lambda} a_{\mu\lambda}\,k_\lambda = 0 \tag{3-110} $$

These are identical because $a_{\mu\lambda} = a_{\lambda\mu}^*$, so that one is merely the complex conjugate form of the other. Now the condition that (110) shall have a non-vanishing solution $k_1, k_2, \ldots, k_n$ is that the determinant

$$ |a_{\lambda\mu}| = 0 $$

This, therefore, is the condition for linear dependence of the functions $\varphi_\lambda$. Conversely, if $|a_{\lambda\mu}| \neq 0$, the set of functions is linearly independent. The determinant $|a_{\lambda\mu}|$ is named after the mathematician Gram.
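The Gram criterion is simple to try out on real polynomials, for which the integrals $a_{\lambda\mu}$ are elementary. The following Python sketch is an illustration under stated assumptions: the functions are real polynomials given as coefficient lists (lowest power first), the range of x is taken to be (0, 1), and the names `inner` and `gram_determinant` are hypothetical. Exact fractions avoid any question of round-off when deciding whether the determinant vanishes.

```python
from fractions import Fraction

def inner(p, q):
    """a = integral over (0, 1) of p(x) q(x) dx for real polynomials p, q."""
    return sum(Fraction(a * b, i + j + 1)
               for i, a in enumerate(p) for j, b in enumerate(q))

def gram_determinant(funcs):
    """Determinant |a_{lambda mu}| by exact Gaussian elimination."""
    m = [[inner(p, q) for q in funcs] for p in funcs]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)           # singular: functions are dependent
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

if __name__ == "__main__":
    print(gram_determinant([[1], [0, 1], [0, 0, 1]]))   # 1, x, x^2
    print(gram_determinant([[1], [0, 1], [3, 2]]))      # 1, x, 3 + 2x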

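A simpler test, applicable when the functions $\varphi_1, \ldots, \varphi_n$ are differentiable n − 1 times within their range of definition, may be conducted as follows. If the functions are linearly dependent,

$$ \sum_{\lambda=1}^{n} k_\lambda\varphi_\lambda = 0, \qquad \sum_{\lambda=1}^{n} k_\lambda\varphi_\lambda' = 0, \qquad \ldots, \qquad \sum_{\lambda=1}^{n} k_\lambda\varphi_\lambda^{(n-1)} = 0 $$

These n homogeneous equations may be regarded as determining the set of constants $k_\lambda$. It will be shown in section 10.9 that they possess solutions other than $k_1 = k_2 = k_3 = \cdots = 0$ only if the determinant of the coefficients of $k_\lambda$, called the Wronskian,

$$ W = \begin{vmatrix} \varphi_1 & \varphi_2 & \cdots & \varphi_n \\ \varphi_1' & \varphi_2' & \cdots & \varphi_n' \\ \vdots & \vdots & & \vdots \\ \varphi_1^{(n-1)} & \varphi_2^{(n-1)} & \cdots & \varphi_n^{(n-1)} \end{vmatrix} $$

vanishes. For independence of the solutions, then, the Wronskian must not vanish. It should be stated, however, that the vanishing of this determinant is not a sufficient condition for linear dependence of the functions.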

3.14. Schwarz' Inequality. — Let f and g be any two functions of x such that the integrals

$$ A = \int f^*f\,dx, \quad B = \int f^*g\,dx, \quad C = \int g^*g\,dx \tag{3-111} $$

exist. The integrations extend over any definite range of the variable x. Certainly the integral

$$ \int [\lambda f^*(x) + g^*(x)]\,[\lambda f(x) + g(x)]\,dx = A\lambda^2 + (B^* + B)\lambda + C $$

in which λ is to be considered as a real variable, independent of x, is always positive or zero (zero only when g is directly proportional to f) and hence

VECTOR ANALYSIS

4.1. Definition of a Vector. — A physical quantity possessing both 
magnitude and direction is called a vector; typical examples are velocity, 
acceleration, force and angular momentum; other quantities such as mass, 
volume, temperature and time, having magnitude only, are called scalars. 
It is customary to represent vectors by letters in bold-face type and scalars 
in italics, so that A stands for a vector whose magnitude is A. This custom 
will here be followed. A vector may be indicated graphically by an arrow 
drawn between two points, tail and head of the arrow being its origin and 
terminus, respectively; the scalar part of the vector equals (or is propor- 
tional to) the length of the arrow and the direction of arrow and vector 
coincide. 

It is often necessary to locate a vector relative to a coordinate system, which may be done by giving the coordinates of origin and terminus. Let the selected coordinate system be the usual right-handed Cartesian one with three mutually perpendicular axes X, Y and Z so oriented that if the positive X-axis points towards the reader's right and the positive Y-axis towards the top of the page, the positive Z-axis will point up from the page towards the reader. Let the coordinates of origin and terminus of A be $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$, respectively; then the three rectangular Cartesian components of A relative to the axes X, Y, Z are defined to be

$$ A_x = x_2 - x_1; \quad A_y = y_2 - y_1; \quad A_z = z_2 - z_1 $$

The length of the vector is the distance between the two points:

$$ A = \sqrt{A_x^2 + A_y^2 + A_z^2} $$

The two points might be located relative to many other coordinate systems, one of which could be obtained from the previous one by rotation of the axes to X', Y', Z' and translation of the origin from O to O' as shown in Fig. 1. Suppose the coordinates of O' in the first system are $(x_0, y_0, z_0)$, then the position of the second system is determined with respect to the first when the angles between O'X', O'Y', O'Z' and OX, OY, OZ are known. The cosines of these nine angles are given in Table 1, where for the present purpose the first row and column are to be used; for example, $a_{13}$ is the cosine of the angle between the two straight lines O'X' and OZ.

TABLE 1

                     OX       OY       OZ
                     A_x      A_y      A_z
  O'X'     A'_x      a_11     a_12     a_13
  O'Y'     A'_y      a_21     a_22     a_23
  O'Z'     A'_z      a_31     a_32     a_33

In order to locate the vector in the system O'X'Y'Z', it is necessary first to obtain relations between the nine direction cosines. From a well-known formula of solid analytic geometry, if θ is the angle between two straight lines, whose angles with the coordinate axes are $\alpha_1, \beta_1, \gamma_1, \alpha_2, \beta_2, \gamma_2$,

$$ \cos\theta = \cos\alpha_1\cos\alpha_2 + \cos\beta_1\cos\beta_2 + \cos\gamma_1\cos\gamma_2 \tag{4-1} $$

If the lines are perpendicular to each other, as is true for the axes OY and OZ, cos θ = 0, hence

$$ a_{12}a_{13} + a_{22}a_{23} + a_{32}a_{33} = 0 \tag{4-2} $$

Five similar equations result for the other mutually perpendicular axes. Six further relations¹ are obtained from the fact that the sum of the squares of the direction cosines of any line is unity; for example, for the line OX relative to O'X'Y'Z'

$$ a_{11}^2 + a_{21}^2 + a_{31}^2 = 1 \tag{4-3} $$

¹ To the twelve equations in (2) and (3) may be added ten more relations between the direction cosines. It is evident that the nine cosines are not linearly independent.

Now let $(x_1, y_1, z_1)$ and $(x_1', y_1', z_1')$ be the coordinates of the same point P in OXYZ and O'X'Y'Z' and let $\alpha_1, \beta_1, \gamma_1, \alpha_2$ be the direction angles of O'P with OX, OY, OZ, O'X'. Then from (1) and Table 1,

$$ x_1' = O'P\cos\alpha_2 = O'P\,(a_{11}\cos\alpha_1 + a_{12}\cos\beta_1 + a_{13}\cos\gamma_1) = a_{11}(x_1 - x_0) + a_{12}(y_1 - y_0) + a_{13}(z_1 - z_0) $$

In like manner,

$$ y_1' = a_{21}(x_1 - x_0) + a_{22}(y_1 - y_0) + a_{23}(z_1 - z_0) $$

$$ z_1' = a_{31}(x_1 - x_0) + a_{32}(y_1 - y_0) + a_{33}(z_1 - z_0) $$


Similarly, if the components of A in O'X'Y'Z' are

$$ A_x' = x_2' - x_1'; \quad A_y' = y_2' - y_1'; \quad A_z' = z_2' - z_1' $$

then,

$$ A_x' = a_{11}A_x + a_{12}A_y + a_{13}A_z \tag{4-4} $$

and two other expressions for $A_y'$ and $A_z'$ may be derived in the same way. These three equations may be solved for the unprimed quantities in terms of the primed ones, or the same method may be continued to give three relations like

$$ A_x = a_{11}A_x' + a_{21}A_y' + a_{31}A_z' \tag{4-5} $$


All of them are symbolized, in self-explanatory fashion, in Table 1 if the second row and column are used. While it is usually true that the components of a vector are different in different reference frames, certain properties such as the length and the angle between two vectors are equal in all frames. It is readily shown using (2), (3) and (5), that

$$ A = A' = \sqrt{A_x'^2 + A_y'^2 + A_z'^2} $$
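The relations (2) to (5) and the invariance of length can be verified numerically for any particular rotation. The sketch below is illustrative: it builds a table of direction cosines by composing two elementary rotations (an arbitrary choice, as are the angles and the sample vector), with row i holding the cosines of the i-th primed axis relative to OX, OY, OZ as in Table 1. The function names are assumptions.

```python
from math import cos, sin, sqrt, isclose

def rotation_about_z(angle):
    """Direction cosines for primed axes turned about OZ by `angle`."""
    c, s = cos(angle), sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_about_x(angle):
    """Direction cosines for primed axes turned about OX by `angle`."""
    c, s = cos(angle), sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(p, q):
    """Product of two 3x3 tables, giving the combined rotation."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def to_primed(a, v):
    """Components in O'X'Y'Z' from those in OXYZ, as in eq. (4-4)."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def to_unprimed(a, v_primed):
    """Components in OXYZ from those in O'X'Y'Z', as in eq. (4-5)."""
    return [sum(a[i][j] * v_primed[i] for i in range(3)) for j in range(3)]

def length(v):
    return sqrt(sum(c * c for c in v))

a = matmul(rotation_about_x(0.4), rotation_about_z(1.1))
A = [1.0, -2.0, 3.0]
A_primed = to_primed(a, A)

if __name__ == "__main__":
    print(sum(a[i][1] * a[i][2] for i in range(3)))   # eq. (4-2), close to 0
    print(sum(a[i][0] ** 2 for i in range(3)))        # eq. (4-3), close to 1
    print(length(A), length(A_primed))                # equal lengths
```

Only the rotation is modelled here; a translation of the origin does not affect the components of A, since they are differences of coordinates.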


Considerable simplification often results in expressing physical laws 
in vector notation, without reference to a selected coordinate system. The 
transformation properties just described, however, show that it is always 
possible to list the components of a vector in any given reference frame 
when so desired. In accordance with these ideas, a vector is sometimes 
defined as a set of numbers $(A_x, A_y, A_z)$ referred to a reference frame, so that if the numbers are then referred to a second frame, they will become $(A_x', A_y', A_z')$ with relations as given by Table 1. Provided these conditions are met the vector is said to be a proper vector. This analytical definition is more restrictive than the intuitive conception of a vector as a quantity possessing magnitude and direction, but it leads naturally to the more general idea of the tensor and it may be readily extended to the vector in n-dimensional space as used in many branches of modern analysis. It is sometimes found that physical quantities having magnitude and direction do not transform according to the cosine law here illustrated. In this
chapter, we prefer to emphasize the geometric rather than the transforma- 
tion properties but we do assume that all of the vectors discussed are proper 
vectors. 

Problem. Show that the two components

$$ A_x = y, \quad A_y = x $$

do not define a vector, but that

$$ A_x = y, \quad A_y = -x $$

represent a proper 2-dimensional vector. This example is discussed in L. Page and N. I. Adams, "Electrodynamics," p. 18, D. Van Nostrand Co., New York, 1940.

4.2. Unit Vectors. — Vectors of unit length, drawn along the axes OX, OY, and OZ, respectively, are called unit vectors (cf. Fig. 2); they are designated by i, j, and k, respectively. Any directed line along any of the three axes is also a vector, for if its length is $A_x$ units along the X-axis, the scalar magnitude is thereby given and its direction is specified by the unit vector i, the whole vector being designated by $A_x\mathbf{i}$. Similar vectors could be drawn along the Y- or Z-axes, $A_y\mathbf{j}$ or $A_z\mathbf{k}$.

Fig. 4-2

4.3. Addition and Subtraction of Vectors. — Referring again to Fig. 2, it is seen that the diagonal of the parallelogram, whose two unequal sides are the vectors $A_x\mathbf{i}$ and $A_z\mathbf{k}$, is also a vector, its origin being taken as coincident with the coordinate origin. From the previous discussion, it follows that the reference frame OXYZ is superfluous, so that the symbols $A_x\mathbf{i}$ and $A_z\mathbf{k}$ may be replaced by more general symbols, A and B. The resultant vector C representing the diagonal is called the vector sum of A and B,

A + B = C 

The addition of vectors thus obeys the familiar rule for composition of 
forces in mechanics. To obtain the difference of two vectors, A — B, it is 
only necessary to define the negative of a vector. This is taken to mean 
a vector whose length is equal and whose direction is opposite to that of the 
original vector. Thus A − B = A + (−B). Hence the rule: To form the difference of two vectors graphically, reverse the direction of the subtrahend and complete the parallelogram as before.

From the parallelogram law, it is seen that any vector in a plane may be resolved in numerous ways into two components in the same plane, and that a vector in space may be resolved in numerous ways into three components, not in the same plane. If the resolution is made along the rectangular axes, the result may be symbolized in terms of unit vectors,

$$ \mathbf{C} = A_x\mathbf{i} + A_z\mathbf{k} $$

and

$$ \mathbf{R} = A_x\mathbf{i} + A_y\mathbf{j} + A_z\mathbf{k} \tag{4-6} $$

From the geometry of Fig. 2, the lengths of C and R are

$$ C = (A_x^2 + A_z^2)^{1/2} $$

$$ R = (A_x^2 + A_y^2 + A_z^2)^{1/2} \tag{4-7} $$

The laws which govern addition and subtraction of vectors are easily seen to be associative, commutative, and distributive. Multiplication of a vector by a scalar is understood to mean multiplication of its length by the scalar factor, without change in its direction. Vector algebra thus developed enables one to demonstrate many geometrical theorems in a simple way.*

Problem a. Prove that the diagonals of a parallelogram bisect each other. 

Problem b. Prove that the line that joins one corner of a parallelogram to the 
middle point of an opposite side trisects the diagonal and is trisected by it. 

4.4. The Scalar Product of Two Vectors. — The scalar (or inner) product† of two vectors is defined by

$$ \mathbf{A}\cdot\mathbf{B} = AB\cos\theta \tag{4-8} $$

where θ is the angle between A and B. It follows that the scalar product of two perpendicular unit vectors must vanish since θ = π/2, cos θ = 0. Similarly the scalar product of a unit vector by itself must equal unity since θ = 0, cos θ = 1. In vector notation,

$$ \mathbf{i}\cdot\mathbf{j} = \mathbf{j}\cdot\mathbf{i} = \mathbf{i}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{j} = 0 \tag{4-9} $$

$$ \mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = 1 \tag{4-10} $$

* Numerous examples may be found in books on vector analysis; see for example: Phillips, H. B., "Vector Analysis," John Wiley and Sons, New York, 1933; Gibbs-Wilson, "Vector Analysis," Yale University Press, New Haven, Conn., 1925.

† Also called the dot product.

If A = B, θ = 0, so from (8) and (10),

$$ \mathbf{A}\cdot\mathbf{A} = A^2 = A_x^2 + A_y^2 + A_z^2 $$

an equation which defines the square of the length of A (see also Fig. 3). If

$$ \mathbf{A}\cdot\mathbf{B} = 0 \tag{4-11} $$

for any two vectors, A and B are perpendicular to each other, unless one vanishes; if

$$ \mathbf{A}\cdot\mathbf{B} = AB $$

then A and B are parallel. In a Cartesian system,

$$ \mathbf{A}\cdot\mathbf{B} = A_xB_x + A_yB_y + A_zB_z \tag{4-12} $$

The scalar product obeys the rules of ordinary multiplication

$$ \mathbf{A}\cdot\mathbf{B} = \mathbf{B}\cdot\mathbf{A} $$

$$ \mathbf{A}\cdot(\mathbf{B} + \mathbf{C}) = (\mathbf{A}\cdot\mathbf{B}) + (\mathbf{A}\cdot\mathbf{C}) $$

From (8), it is seen that any relation involving the cosine of an included angle may be written in terms of the scalar product. For example, the mechanical work W done by a force F which makes an angle θ with the displacement D is W = FD cos θ or in vector notation, $W = \mathbf{F}\cdot\mathbf{D}$.

Problem. If A and B are the sides of a parallelogram and C, D are the diagonals, show that $C^2 + D^2 = 2(A^2 + B^2)$; $C^2 - D^2 = 4AB\cos\theta$.

4.5. The Vector Product of Two Vectors. — Let two arbitrary vectors, A and B, be drawn from a common origin O with an included angle θ, 0 < θ < π, and let C be a vector perpendicular to S, the plane of A and B (cf. Fig. 4). Then from (11) and (12)

$$ \mathbf{C}\cdot\mathbf{A} = C_xA_x + C_yA_y + C_zA_z = 0 $$

$$ \mathbf{C}\cdot\mathbf{B} = C_xB_x + C_yB_y + C_zB_z = 0 $$

Solving, we find

$$ C_x = m(A_yB_z - A_zB_y) $$

$$ C_y = m(A_zB_x - A_xB_z) \tag{4-13} $$

$$ C_z = m(A_xB_y - A_yB_x) $$



where m is an arbitrary constant, which is conveniently taken as +1. Then from (6) and (13),

$$ C^2 = C_x^2 + C_y^2 + C_z^2 = (A_x^2 + A_y^2 + A_z^2)(B_x^2 + B_y^2 + B_z^2) - (A_xB_x + A_yB_y + A_zB_z)^2 $$

The first member on the right-hand side of this equals $A^2B^2$ by (6), the second member equals $(\mathbf{A}\cdot\mathbf{B})^2 = (AB\cos\theta)^2$ by (12) and (8), hence

$$ C^2 = (A^2B^2 - A^2B^2\cos^2\theta) = (AB\sin\theta)^2 $$

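The vector C may thus be described as the product of two other vectors, A and B; it is called the vector (or skew) product† and is written

$$ \mathbf{C} = \mathbf{A}\times\mathbf{B} $$

Its length is C = AB sin θ; its direction is perpendicular to the plane determined by A and B. Using (13) and the unit vectors, we may also write

$$ \mathbf{C} = \mathbf{A}\times\mathbf{B} = (A_yB_z - A_zB_y)\,\mathbf{i} + (A_zB_x - A_xB_z)\,\mathbf{j} + (A_xB_y - A_yB_x)\,\mathbf{k} $$

† Also called the cross product or outer product.

This may be put in the form of a determinant:

$$ \mathbf{A}\times\mathbf{B} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ A_x & A_y & A_z \\ B_x & B_y & B_z \end{vmatrix} \tag{4-14} $$

As a consequence of (14), vector products of the unit vectors become

$$ \mathbf{i}\times\mathbf{j} = -\mathbf{j}\times\mathbf{i} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{vmatrix} = \mathbf{k} $$

$$ \mathbf{j}\times\mathbf{k} = -\mathbf{k}\times\mathbf{j} = \mathbf{i}; \quad \mathbf{k}\times\mathbf{i} = -\mathbf{i}\times\mathbf{k} = \mathbf{j} $$

and

$$ \mathbf{i}\times\mathbf{i} = \mathbf{j}\times\mathbf{j} = \mathbf{k}\times\mathbf{k} = 0 \tag{4-15} $$

Eq. (14) shows that vector multiplication is not commutative,

$$ \mathbf{A}\times\mathbf{B} = -\mathbf{B}\times\mathbf{A} $$

The distributive law of ordinary multiplication, however, is retained:

$$ \mathbf{A}\times(\mathbf{B} + \mathbf{C}) = \mathbf{A}\times\mathbf{B} + \mathbf{A}\times\mathbf{C} $$

$$ (\mathbf{A} + \mathbf{B})\times(\mathbf{C} + \mathbf{D}) = \mathbf{A}\times\mathbf{C} + \mathbf{A}\times\mathbf{D} + \mathbf{B}\times\mathbf{C} + \mathbf{B}\times\mathbf{D} $$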
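The component formula (13) with m = +1 translates directly into code, and the properties just listed can then be checked on sample vectors. The following Python sketch is an illustration; the function names and the integer test vectors are arbitrary assumptions, chosen so that every comparison is exact.

```python
def dot(a, b):
    """Scalar product in Cartesian components, eq. (4-12)."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product in Cartesian components, eq. (4-13) with m = +1."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

i, j, k = [1, 0, 0], [0, 1, 0], [0, 0, 1]
A, B, C = [1, 2, 3], [4, 5, 6], [-2, 0, 7]

if __name__ == "__main__":
    print(cross(i, j), cross(j, k), cross(k, i))        # k, i, j
    print(cross(A, B), cross(B, A))                     # opposite signs
    print(dot(A, cross(A, B)), dot(B, cross(A, B)))     # both zero
    # C^2 = A^2 B^2 - (A.B)^2, i.e. (AB sin theta)^2
    print(dot(cross(A, B), cross(A, B)),
          dot(A, A) * dot(B, B) - dot(A, B) ** 2)
```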

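Problem. Prove by vector methods the trigonometric relations

$$ \cos(x \pm y) = \cos x\cos y \mp \sin x\sin y $$

$$ \sin(x \pm y) = \sin x\cos y \pm \cos x\sin y $$

Hint: Take three vectors:

$$ \mathbf{A} = \cos x\,\mathbf{i} + \sin x\,\mathbf{j}, \quad \mathbf{B} = \cos y\,\mathbf{i} + \sin y\,\mathbf{j}, \quad \mathbf{C} = \cos y\,\mathbf{i} - \sin y\,\mathbf{j} $$

Form the scalar and vector products.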

The close connection between the vector C = A X B and the parallelo- 
gram whose sides are A and B suggests that it may be useful generally to 
represent areas by vectors. The convention usually adopted in this con- 
nection, with reference to plane areas, is the following. The area is repre- 
sented by a vector perpendicular to the area; and of length equal to its 
size. This leaves the direction of the vector undetermined. The latter 
is fixed relative to the sense in which the contour of the area is described: 
it is taken to be that direction in which a right-handed screw would advance 
when turned in the sense in which the contour is to be described. When the 
sense of the contour is not specified, the direction of the area vector remains 
undetermined. For closed surfaces, it is customary to draw the vector 
along the outward normal. 

We now consider two important examples of vector products.

a. Moment of a Force. In mechanics, the moment of a force about a point O is defined as the product of the force by its perpendicular distance from O. From the geometry of Fig. 5a this product equals twice the area of the triangle OPQ. It may be represented as

$$ \mathbf{M} = \mathbf{D}\times\mathbf{F} $$

where M, D, and F are vectors representing the moment, perpendicular distance and force, respectively. The sign of M, fixed by the previous definition of the area as a vector quantity, is positive on that side of the plane passed through O and the line F on which the force tends to produce a rotation about O in the positive direction. If D be drawn from O to any point in the line of action of F (cf. Fig. 5b), the perpendicular distance is D sin θ and the moment is still given by the vector product, D × F. If the force has components $F_x, F_y, F_z$ and D has components $D_x, D_y, D_z$, the components of M are

$$ M_x = D_yF_z - D_zF_y $$

$$ M_y = D_zF_x - D_xF_z $$

$$ M_z = D_xF_y - D_yF_x $$

Fig. 4-5

b. Angular and Linear Velocity. Suppose a rigid body is rotating about
a fixed axis with a constant angular velocity of ω radians per second. The
rotation of the body is then described by the vector ω with length equal to
the scalar ω and direction parallel to the axis of rotation. Its sign, by the
convention, is positive in the same direction in which a right-handed
screw would progress under the given rotation. Any point P not on the
axis (cf. Fig. 6) will then describe a circle concentric with, and in a plane
perpendicular to, the axis, this point being determined by any vector R
drawn from a point O on the axis of rotation. The linear velocity of P is at
right angles to both ω and R, its magnitude being L = ωR sin θ, or in vector
symbols

$$\mathbf{L} = \boldsymbol{\omega} \times \mathbf{R}$$ (4-16)

Fig. 4-6

4.6. Products Involving Three Vectors. — From three arbitrary vectors,
A, B and C, the following products may tentatively be formed:

(a) A(B • C)        (d) A(B × C)
(b) A • (B × C)     (e) A • (B • C)
(c) A × (B × C)     (f) A × (B • C)

Of these expressions, (e) and (f) are meaningless since vector products
have only been defined when vectors stand on both sides of the dot or cross.
Furthermore, no meaning has been attached to two vectors standing
together in the absence of one of these signs, hence (d) is of no interest
here.

a. Since (B • C) = BC cos θ is a scalar, the triple product A(B • C) is a
new vector whose direction is the same as that of A; its magnitude equals A
multiplied by BC cos θ.

b. The product A • (B X C), called the scalar triple product, is a scalar, 
for (B X C) = D, a new vector. We have 

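$$\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = (\mathbf{B}\times\mathbf{C})\cdot\mathbf{A} = \mathbf{A}\cdot\mathbf{D} = \mathbf{D}\cdot\mathbf{A} = \text{a scalar}$$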

Moreover, the new vector D is perpendicular to both B and C, or from (11) 
B • (B X C) = C • (B X C) = 0 

If the three vectors A, B and C are the edges of a parallelepiped, as shown
in Fig. 7, then (B × C) is a vector whose length equals the area of the
parallelogram forming the base of the parallelepiped; its direction is
perpendicular to the plane of B and C. The scalar triple product is thus the
area of the base multiplied by the projection of the slant height A on the
vector (B × C), or a scalar whose magnitude equals the volume of the
parallelepiped, V. By taking various faces in turn, we find from Fig. 7,

$$\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = \mathbf{B}\cdot(\mathbf{C}\times\mathbf{A}) = \mathbf{C}\cdot(\mathbf{A}\times\mathbf{B}) = V$$

Since a change of order in the vector product changes the sign there are 
other possible relations which may be abbreviated by writing 

V = [ABC] = A • (B X C) = (B X C) • A, etc. (4-17) 

and 

[ABC] = [BCA] = [CAB] = -[ACB] = -[BAC] = -[CBA] (4-18) 

Each term in square brackets stands for the two possible ways of writing 
the triple product as shown in (17). It also follows from (18) that the 
cross and dot may be exchanged at will, provided the cyclical order of the 
three vectors is retained. The parenthesis in a product like A • (B X C) is 
superfluous but it is often written for clarity. 

Because of (15), the scalar triple products of unit vectors all disappear 
except 

[ijk] = -[ikj] = 1 

which follows from (9). If the three vectors A, B, C are written in terms of
unit vectors and the indicated multiplications performed, the use of (9),
(10) and (15) gives

$$[\mathbf{ABC}] = A_xB_yC_z + B_xC_yA_z + C_xA_yB_z - A_xC_yB_z - B_xA_yC_z - C_xB_yA_z = \begin{vmatrix} A_x & A_y & A_z \\ B_x & B_y & B_z \\ C_x & C_y & C_z \end{vmatrix}$$ (4-19)
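The cyclic property (18) and the determinant form (19) can be confirmed for particular vectors. The sketch below is only an illustration; the helper names `dot`, `cross`, `triple` and `det3`, and the sample vectors, are assumptions introduced for this purpose.

```python
def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product a x b by components."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def triple(a, b, c):
    """Scalar triple product [abc] = a . (b x c)."""
    return dot(a, cross(b, c))

def det3(a, b, c):
    """Six-term expansion of the determinant whose rows are a, b, c."""
    return (a[0] * b[1] * c[2] + b[0] * c[1] * a[2] + c[0] * a[1] * b[2]
            - a[0] * c[1] * b[2] - b[0] * a[1] * c[2] - c[0] * b[1] * a[2])

# Sample edges of a parallelepiped (assumed values).
A, B, C = (1, 2, 3), (0, 1, 4), (5, 6, 0)
print(triple(A, B, C), triple(B, C, A), triple(C, A, B))  # cyclic order: all equal
print(triple(A, C, B))                                    # one interchange: sign reversed
print(det3(A, B, C))                                      # agrees with the triple product
```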

c. The product, V = A X (B X C), called the vector triple product, is 
a vector since it is the vector product of two vectors, A and (B X C). It 
is therefore perpendicular to both of its factors:

V • A = 0; V • (B X C) = 0 

but V must lie in the plane of B and C, since it is perpendicular to the vector 
product of B and C, which itself is perpendicular to both B and C. 

The most important property of this triple product is that it permits 
decomposition into two scalar products: 

A X (B X C) = B(A . C) - C(A • B) (4-20) 

a relation which may be proved geometrically or analytically by expanding 
in Cartesian coordinates. Since the vector product changes its sign when 
the order of multiplication is changed, the sign of the triple vector product 
will change when the order of the factors in the parenthesis is changed or 
when the position of the parenthesis is changed: 

A X (B X C) = -A X (C X B) = (C X B) X A = -(B X C) X A 

Products of more than three vectors may always be reduced to one of the 
three preceding types of triple products by successive application of the 
above rules. 
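The decomposition (20) and the sign rules that follow it can be tried on particular vectors. The following sketch is illustrative only; the helper names and the integer sample vectors are assumptions. It also shows that shifting the parenthesis without changing the order gives, in general, a different vector.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def scale(k, a):
    return tuple(k * x for x in a)

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

A, B, C = (1, 2, 3), (0, 1, 4), (5, 6, 0)   # sample vectors (assumed values)

lhs = cross(A, cross(B, C))                          # A x (B x C)
rhs = sub(scale(dot(A, C), B), scale(dot(A, B), C))  # B(A . C) - C(A . B)
print(lhs, rhs)

print(scale(-1, cross(cross(B, C), A)))   # -(B x C) x A, equal to A x (B x C)
print(cross(cross(A, B), C))              # (A x B) x C, a different vector
```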


Problem. Verify the relations: 

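$$(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{C}\times\mathbf{D}) = (\mathbf{A}\cdot\mathbf{C})(\mathbf{B}\cdot\mathbf{D}) - (\mathbf{A}\cdot\mathbf{D})(\mathbf{B}\cdot\mathbf{C})$$

$$(\mathbf{A}\times\mathbf{B})\times(\mathbf{C}\times\mathbf{D}) = \mathbf{B}[\mathbf{ACD}] - \mathbf{A}[\mathbf{BCD}] = \mathbf{C}[\mathbf{ABD}] - \mathbf{D}[\mathbf{ABC}]$$

$$[\mathbf{A}\times\mathbf{B}\;\;\mathbf{B}\times\mathbf{C}\;\;\mathbf{C}\times\mathbf{A}] = [\mathbf{ABC}]^2$$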




Fig. 4-8



4.7. Differentiation of Vectors. — If a vector R is a function of a single
scalar t, which for convenience may be assumed to be the time, there are
three possible ways in which R may vary. Let R1 and R2 refer to times t1
and t2; then R2 may differ from R1: (a) in magnitude only; (b) in direction
only; (c) in both magnitude and direction as shown in Fig. 8. Since
no complication arises from treating the general case, let us assume that a
curve is traced by the terminus of a continuously varying vector R, the
origin of the latter being kept fixed at the origin of a coordinate system.
The vector ΔR = R2 − R1, having the direction of the secant AB of
Fig. 8, approaches the tangent of the curve C at the point R1 as Δt = t2 − t1
approaches zero. The quotient ΔR/Δt is the average rate of change of R
in the time interval between t1 and t2. Following the usual methods of
differential calculus, the derivative of R is defined as

$$\frac{d\mathbf{R}}{dt} = \lim_{\Delta t \to 0} \frac{\Delta\mathbf{R}}{\Delta t}$$

In terms of unit vectors, and with the use of primes for differentiation,

$$\mathbf{R} = \mathbf{i}R_x + \mathbf{j}R_y + \mathbf{k}R_z$$

$$\mathbf{R}' = \mathbf{i}R_x' + \mathbf{j}R_y' + \mathbf{k}R_z'$$

$$\mathbf{R}'' = \mathbf{i}R_x'' + \mathbf{j}R_y'' + \mathbf{k}R_z''$$

For a composite function of two or more vectors, each of which depends on
the single scalar t, the usual rules of differentiation hold except that, of
course, the order of the vectors must not be changed in cases involving the
vector product.

In the special case of Fig. 8a, where R is constant in direction but variable
in magnitude, ΔR is parallel to R. Similarly, in case (b), ΔR is in the limit
perpendicular to R, for the fixed length of R requires R • R = constant, so
that d(R • R)/dt = 2R • dR/dt = 0 and hence R • dR/dt = 0, the latter
being the requirement that R and dR/dt be perpendicular.
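The perpendicularity of R and dR/dt for a vector of fixed length can be seen numerically. The sketch below is an illustration under stated assumptions: the sample curve `R(t)`, the step size and the helper names are all chosen for the example, and the derivative is approximated by a central difference quotient in the spirit of the limit ΔR/Δt.

```python
import math

def R(t):
    """Sample vector of fixed length whose direction changes with t (assumed curve)."""
    return (2.0 * math.cos(t), 2.0 * math.sin(t), 0.5)

def derivative(f, t, h=1e-6):
    """Central-difference approximation to dR/dt."""
    plus, minus = f(t + h), f(t - h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

t = 0.7
dR = derivative(R, t)
print(dot(R(t), R(t)))   # squared length, the same for every t
print(dot(R(t), dR))     # R . dR/dt, zero up to the error of the approximation
```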

4.8. Scalar and Vector Fields. — A scalar field is defined as a region of
space, with each point of which there is associated a scalar point function
(cf. sec. 1.7). A simple example is the temperature of points in the
atmosphere at a given moment. On the other hand, if there is a vector
associated with each point in a region of space, the points and vectors constitute
a vector field, an example being the wind velocity of points in the atmosphere
at any instant.

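Suppose φ(x,y,z) is a scalar point function referred to a given coordinate
system. It will usually change its form if referred to another system, say
(x',y',z'), but its value at any point must be unchanged, or φ(x,y,z) = φ'(x',y',z'). For
example, the temperature at any point in the atmosphere cannot depend on
the coordinate system used to describe the point. Differentiating φ = φ'
partially, we obtain

$$\frac{\partial\phi'}{\partial x'} = \frac{\partial x}{\partial x'}\frac{\partial\phi}{\partial x} + \frac{\partial y}{\partial x'}\frac{\partial\phi}{\partial y} + \frac{\partial z}{\partial x'}\frac{\partial\phi}{\partial z} = a_{11}\frac{\partial\phi}{\partial x} + a_{12}\frac{\partial\phi}{\partial y} + a_{13}\frac{\partial\phi}{\partial z}$$

$$\frac{\partial\phi'}{\partial y'} = a_{21}\frac{\partial\phi}{\partial x} + a_{22}\frac{\partial\phi}{\partial y} + a_{23}\frac{\partial\phi}{\partial z}$$

$$\frac{\partial\phi'}{\partial z'} = a_{31}\frac{\partial\phi}{\partial x} + a_{32}\frac{\partial\phi}{\partial y} + a_{33}\frac{\partial\phi}{\partial z}$$

with three similar equations for ∂φ/∂x, ∂φ/∂y, ∂φ/∂z. Comparing these
derivatives with (4) and (5), it follows that (∂φ/∂x, ∂φ/∂y, ∂φ/∂z) are the
three components of a vector since they transform from one reference frame
to another in the manner prescribed for vector components.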

Using the abbreviation

$$\nabla = \mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}$$ (4-21)

let us study the quantities ∇ * ψ, where ψ is either a scalar or a vector and
(*) is either to be omitted or replaced by a dot or a cross in order to give
products which have meaning. The operator ∇, called del, is not a
vector in the geometrical sense since it has no scalar magnitude, but it does
transform properly, so that it may be treated formally as a vector. The
possible products are ∇φ, where φ is a scalar point function; ∇ • V and
∇ × V, where V is a vector field.

4.9. The Gradient. — The first of these products, called the gradient of
the scalar φ,

$$\nabla\phi = \operatorname{grad}\phi = \mathbf{i}\,\frac{\partial\phi}{\partial x} + \mathbf{j}\,\frac{\partial\phi}{\partial y} + \mathbf{k}\,\frac{\partial\phi}{\partial z}$$ (4-22)

is a vector, since it is the product of a scalar φ and a vector ∇. To
perceive its physical significance, let us consider the family of surfaces
φ(x,y,z) = constant, or the equivalent of this relation

$$d\phi = \frac{\partial\phi}{\partial x}\,dx + \frac{\partial\phi}{\partial y}\,dy + \frac{\partial\phi}{\partial z}\,dz = 0$$

At any point P with coordinates (x,y,z) on one of these surfaces, dR =
i dx + j dy + k dz is a vector tangent to the surface at P, provided dx, dy, dz
satisfy the preceding equation. Since ∇φ • dR = dφ = 0, dR and ∇φ are
perpendicular to each other, or ∇φ is perpendicular to that surface of the family
which passes through P. By the convention of signs previously established,
the direction of ∇φ is that in which φ is increasing. For any other
direction through P, determined by the unit vector s with direction cosines
l, m, n, the component of ∇φ in the direction s is

$$l\,\frac{\partial\phi}{\partial x} + m\,\frac{\partial\phi}{\partial y} + n\,\frac{\partial\phi}{\partial z}$$

which may be written s • ∇φ. This is the directional derivative of φ
in the direction s. In going from P on one of the surfaces (cf. Fig. 9)
to any point Q on the surface φ + dφ, the increase in φ is the same wherever
the point Q is chosen, but the distance PQ will be smallest, and hence s • ∇φ
greatest, when s is in the direction of the normal N. Therefore, since ∇φ
is normal to the surface φ = const. at the point P, its direction and magnitude
give the maximum space rate of increase of the scalar φ.
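The statement that the directional derivative s • ∇φ is greatest along the normal can be checked numerically. The following sketch is a hedged illustration: the sample function `phi`, the point P, the step size and the helper names are assumptions, and the partial derivatives are approximated by central differences rather than computed exactly.

```python
import math

def phi(x, y, z):
    """Sample scalar point function (an assumption for illustration)."""
    return x * x * y + z

def grad(f, p, h=1e-5):
    """Central-difference approximation to (df/dx, df/dy, df/dz) at p."""
    g = []
    for j in range(3):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        g.append((f(*plus) - f(*minus)) / (2.0 * h))
    return tuple(g)

def directional(f, p, s):
    """Directional derivative s . grad f for a unit vector s."""
    return sum(si * gi for si, gi in zip(s, grad(f, p)))

P = (1.0, 2.0, 3.0)
g = grad(phi, P)                          # analytically (2xy, x^2, 1) = (4, 1, 1)
norm = math.sqrt(sum(c * c for c in g))
n = tuple(c / norm for c in g)            # unit normal to the surface phi = const.
print(g)
print(directional(phi, P, n), norm)             # along the normal: equals |grad phi|
print(directional(phi, P, (1.0, 0.0, 0.0)))     # along i: smaller
```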



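4.10. The Divergence. — The scalar product of the vector operator ∇
and a vector V gives a scalar which is called the divergence of V.

$$\nabla\cdot\mathbf{V} = \operatorname{div}\mathbf{V} = \left\{\mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}\right\}\cdot\left\{\mathbf{i}V_x + \mathbf{j}V_y + \mathbf{k}V_z\right\} = \frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z}$$ (4-23)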

If V is a vector field, the derivative ∂Vx/∂x transforms, when a change of
coordinate system is made, like the product AxBx of the x-components of two
vectors A and B, hence the divergence of V is a scalar point function. Suppose
that V represents at each point in space the direction and magnitude of flow
(density times velocity) of some fluid such as water or a gas, or that it represents
thermal or electrical flux. Consider, for example (Fig. 10), a small
parallelepiped of volume dx dy dz = dτ, through which a fluid is passing.
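The loss of fluid mass through face ABCD per unit time is

$$\mathbf{i}\cdot\left[\mathbf{V}(x,y,z) + \frac{\partial\mathbf{V}}{\partial x}\frac{dx}{2}\right]dy\,dz$$

while the gain through EFGH is

$$\mathbf{i}\cdot\left[\mathbf{V}(x,y,z) - \frac{\partial\mathbf{V}}{\partial x}\frac{dx}{2}\right]dy\,dz$$

Therefore the net loss through these two faces is

$$\mathbf{i}\cdot\frac{\partial\mathbf{V}}{\partial x}\,dx\,dy\,dz$$

The losses through the other two pairs of faces are

$$\mathbf{j}\cdot\frac{\partial\mathbf{V}}{\partial y}\,dx\,dy\,dz \quad\text{and}\quad \mathbf{k}\cdot\frac{\partial\mathbf{V}}{\partial z}\,dx\,dy\,dz$$

The total loss from the parallelepiped is therefore

$$\left\{\mathbf{i}\cdot\frac{\partial\mathbf{V}}{\partial x} + \mathbf{j}\cdot\frac{\partial\mathbf{V}}{\partial y} + \mathbf{k}\cdot\frac{\partial\mathbf{V}}{\partial z}\right\}dx\,dy\,dz = (\nabla\cdot\mathbf{V})\,d\tau$$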

If v is the velocity of the fluid of density ρ, V = ρv is called the flux density
and represents the total flow of fluid per unit cross section in unit time.
Then if no fluid is created or destroyed within the parallelepiped, this loss
of mass must equal −(∂ρ/∂t)dτ, so that

$$\nabla\cdot\mathbf{V} = -\frac{\partial\rho}{\partial t}$$

a relation usually called the equation of continuity. If the liquid is
incompressible, ∂ρ/∂t = 0, hence

$$\nabla\cdot\mathbf{V} = 0$$

A similar relation holds for D, the electric displacement, in a region free of charge,

$$\nabla\cdot\mathbf{D} = 0$$

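4.11. The Curl. — The vector product of ∇ and V is called the curl or
rotation of V

$$\operatorname{curl}\mathbf{V} = \nabla\times\mathbf{V} = \mathbf{i}\left\{\frac{\partial V_z}{\partial y} - \frac{\partial V_y}{\partial z}\right\} + \mathbf{j}\left\{\frac{\partial V_x}{\partial z} - \frac{\partial V_z}{\partial x}\right\} + \mathbf{k}\left\{\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right\} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ V_x & V_y & V_z \end{vmatrix}$$ (4-24)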
This function may be used to describe the motion of a rigid body rotating
about an axis with uniform angular velocity ω. The linear velocity of any
point P in the body with radius vector R is (cf. 16) L = ω × R and

$$\operatorname{curl}\mathbf{L} = \nabla\times(\boldsymbol{\omega}\times\mathbf{R})$$ (4-25)

Expanding (25) by (20),

$$\operatorname{curl}\mathbf{L} = \boldsymbol{\omega}(\nabla\cdot\mathbf{R}) - (\nabla\cdot\boldsymbol{\omega})\mathbf{R}$$

Since R = iRx + jRy + kRz = ix + jy + kz, ∇ • R = 3. The angular
velocity is a constant vector, so that ∇ in the last member acts only on R;
that member is therefore to be written in the form (ω • ∇)R, which is to be
interpreted as the scalar operator (ω • ∇) applied to the vector R. Expanding,

$$(\boldsymbol{\omega}\cdot\nabla)\mathbf{R} = \left\{\omega_x\frac{\partial}{\partial x} + \omega_y\frac{\partial}{\partial y} + \omega_z\frac{\partial}{\partial z}\right\}\mathbf{R} = \mathbf{i}\omega_x + \mathbf{j}\omega_y + \mathbf{k}\omega_z = \boldsymbol{\omega}$$

Hence,

$$\operatorname{curl}\mathbf{L} = 3\boldsymbol{\omega} - \boldsymbol{\omega} = 2\boldsymbol{\omega}$$

or the curl of the linear velocity of any point of a rigid body equals twice
the angular velocity: the magnitude is doubled, the direction is unchanged.
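The result curl L = 2ω, together with ∇ • R = 3, can be confirmed by approximating the partial derivatives in (23) and (24) with central differences. This is a sketch under stated assumptions: the value of `OMEGA`, the point P, the step size and all function names are chosen for illustration.

```python
def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

OMEGA = (0.3, -1.2, 2.0)   # sample constant angular velocity (assumed values)

def L(R):
    """Linear velocity omega x R of the point with radius vector R."""
    return cross(OMEGA, R)

def partial(F, i, j, p, h=1e-4):
    """Central-difference estimate of dF_i/dx_j at the point p."""
    plus, minus = list(p), list(p)
    plus[j] += h
    minus[j] -= h
    return (F(plus)[i] - F(minus)[i]) / (2.0 * h)

def curl(F, p):
    """Components of curl F, arranged as in the expansion of the determinant."""
    return (partial(F, 2, 1, p) - partial(F, 1, 2, p),
            partial(F, 0, 2, p) - partial(F, 2, 0, p),
            partial(F, 1, 0, p) - partial(F, 0, 1, p))

def div(F, p):
    return sum(partial(F, i, i, p) for i in range(3))

P = (0.5, -1.0, 2.0)
print(curl(L, P))                    # approximately 2 * OMEGA
print(div(lambda R: tuple(R), P))    # approximately 3, the divergence of R
```

Because L is linear in the coordinates, the difference quotients are exact apart from rounding, and the same curl is obtained at every point P.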

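4.12. Composite Functions Involving ∇. — The following relations
involving ∇ may be verified by expanding the vectors in terms of their
components along three unit vectors, i, j and k.

$$\nabla * (A + B) = \nabla * A + \nabla * B$$

$$\nabla * (\phi A) = \nabla\phi * A + \phi\,\nabla * A$$ (4-26)

$$\nabla(\mathbf{U}\cdot\mathbf{V}) = (\mathbf{V}\cdot\nabla)\mathbf{U} + (\mathbf{U}\cdot\nabla)\mathbf{V} + \mathbf{V}\times(\nabla\times\mathbf{U}) + \mathbf{U}\times(\nabla\times\mathbf{V})$$

$$\nabla\cdot(\mathbf{U}\times\mathbf{V}) = \mathbf{V}\cdot\nabla\times\mathbf{U} - \mathbf{U}\cdot\nabla\times\mathbf{V}$$

$$\nabla\times(\mathbf{U}\times\mathbf{V}) = (\mathbf{V}\cdot\nabla)\mathbf{U} - \mathbf{V}(\nabla\cdot\mathbf{U}) - (\mathbf{U}\cdot\nabla)\mathbf{V} + \mathbf{U}(\nabla\cdot\mathbf{V})$$

In these equations A and B are either scalars or vectors depending on the
choice of (*), φ is a scalar and U, V are vectors. If R = ix + jy + kz,

$$\nabla\cdot\mathbf{R} = 3$$

$$\nabla\times\mathbf{R} = 0$$

$$\mathbf{U}\cdot\nabla\mathbf{R} = \mathbf{U}$$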

Problem. Prove eqs. (26). 

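4.13. Successive Applications of ∇. — There are six possible combinations
in which ∇ occurs twice. The following relations may be proved as
above by expansion in terms of i, j and k.

a.
$$\nabla\cdot\nabla\phi = \nabla^2\phi = \nabla\cdot\operatorname{grad}\phi = \operatorname{div}\operatorname{grad}\phi = \frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} + \frac{\partial^2\phi}{\partial z^2}$$ (4-27)

The operator ∇² is generally called the Laplacian.

b. Since ∇² is a scalar operator, it may also be applied to a vector, the result
being a new vector

$$(\nabla\cdot\nabla)\mathbf{V} = \nabla^2\mathbf{V} = \mathbf{i}\left\{\frac{\partial^2V_x}{\partial x^2} + \frac{\partial^2V_x}{\partial y^2} + \frac{\partial^2V_x}{\partial z^2}\right\} + \mathbf{j}\left\{\frac{\partial^2V_y}{\partial x^2} + \frac{\partial^2V_y}{\partial y^2} + \frac{\partial^2V_y}{\partial z^2}\right\} + \mathbf{k}\left\{\frac{\partial^2V_z}{\partial x^2} + \frac{\partial^2V_z}{\partial y^2} + \frac{\partial^2V_z}{\partial z^2}\right\}$$ (4-28)

c.
$$\nabla(\nabla\cdot\mathbf{V}) = \operatorname{grad}\operatorname{div}\mathbf{V} = \mathbf{i}\left\{\frac{\partial^2V_x}{\partial x^2} + \frac{\partial^2V_y}{\partial x\,\partial y} + \frac{\partial^2V_z}{\partial x\,\partial z}\right\} + \mathbf{j}\left\{\frac{\partial^2V_x}{\partial x\,\partial y} + \frac{\partial^2V_y}{\partial y^2} + \frac{\partial^2V_z}{\partial y\,\partial z}\right\} + \mathbf{k}\left\{\frac{\partial^2V_x}{\partial x\,\partial z} + \frac{\partial^2V_y}{\partial y\,\partial z} + \frac{\partial^2V_z}{\partial z^2}\right\}$$ (4-29)

d.
$$\nabla\times\nabla\phi = \operatorname{curl}\operatorname{grad}\phi = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \dfrac{\partial\phi}{\partial x} & \dfrac{\partial\phi}{\partial y} & \dfrac{\partial\phi}{\partial z} \end{vmatrix} = 0$$ (4-30)

This is an identity. If for some vector V, ∇ × V = 0, then V = ∇φ,
where φ is some scalar function. Under these conditions, V is said to be
irrotational. Expansion also yields

e.
$$\nabla\cdot\nabla\times\mathbf{V} = \operatorname{div}\operatorname{curl}\mathbf{V} = 0$$ (4-31)

Thus if for any vector W, ∇ • W = 0, then W = ∇ × V and W is said to be
solenoidal.

Finally, the reader will easily check, by expansion in rectangular
components, the relation

f.
$$\nabla\times(\nabla\times\mathbf{V}) = \operatorname{curl}\operatorname{curl}\mathbf{V} = \operatorname{grad}\operatorname{div}\mathbf{V} - \nabla^2\mathbf{V} = \nabla(\nabla\cdot\mathbf{V}) - \nabla\cdot\nabla\mathbf{V}$$ (4-32)

Problem. Show by expansion that eqs. (4-27, 28, 29, 30, 31, 32) are correct.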
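As a numerical companion to the expansion asked for in the problem, the identities curl grad φ = 0 and div curl V = 0 can be tried on sample fields. The sketch below is an illustration, not a proof: the fields `phi` and `V`, the point P, the step `H` and the helper names are assumptions, and all derivatives are central-difference approximations.

```python
import math

H = 1e-3  # finite-difference step (an arbitrary choice)

def partial(f, p, j):
    """Central-difference estimate of df/dx_j at p for a scalar-valued f."""
    plus, minus = list(p), list(p)
    plus[j] += H
    minus[j] -= H
    return (f(plus) - f(minus)) / (2.0 * H)

def grad(phi):
    return lambda p: tuple(partial(phi, p, j) for j in range(3))

def curl(V):
    def curl_at(p):
        d = lambda i, j: partial(lambda q: V(q)[i], p, j)   # dV_i/dx_j
        return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
    return curl_at

def div(V):
    return lambda p: sum(partial(lambda q, i=i: V(q)[i], p, i) for i in range(3))

def phi(p):
    """Sample scalar field (assumed)."""
    x, y, z = p
    return math.sin(x) * y * y + math.exp(z) * x

def V(p):
    """Sample vector field (assumed)."""
    x, y, z = p
    return (x * y * z, math.sin(y) * z, x * x + y)

P = (0.4, -0.8, 1.1)
print(curl(grad(phi))(P))   # each component near zero
print(div(curl(V))(P))      # near zero
```

The central-difference operators commute with one another just as the partial derivatives do, which is why the results vanish to within rounding error.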

4.14. Vector Integration. — As a simple example of vector integration,
we consider the motion of a particle under the constant acceleration of
gravity. The equation of motion is

$$\frac{d^2\mathbf{R}}{dt^2} = \mathbf{G}$$

where G is a constant vector. Integration results in dR/dt = Gt + V0;
R = Gt²/2 + V0 t + C0, where V0 and C0 are the constants of integration,
which are vectors not necessarily collinear with G. They are determined
by the values of dR/dt and R, respectively, when t = 0.

More complicated cases may arise, however, for in the general case the
integral is ∫ p * dτ, where p and dτ may be scalars or vectors, (*) has the
same meaning as before and the integrals may be multiple.
4.15. Line Integrals. — Suppose dτ is the vector ds, where s = s(t)
is the equation for a curve. It is then possible to form the integrals:

$$\text{(a)}\ \int_C \phi\,d\mathbf{s};\qquad \text{(b)}\ \int_C \mathbf{V}\cdot d\mathbf{s};\qquad \text{(c)}\ \int_C \mathbf{V}\times d\mathbf{s}$$ (4-33)

each of these being called the line integral along the curve C. The results
of integration are, respectively, a vector, a scalar and a vector.

Since

$$d\mathbf{s} = \mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz$$ (4-34)

the first integral in (33) becomes:

$$\int_C \phi\,d\mathbf{s} = \int_A^B \phi(x,y,z)(\mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz) = \int_{x_1}^{x_2}\phi(x,y,z)\,\mathbf{i}\,dx + \int_{y_1}^{y_2}\phi(x,y,z)\,\mathbf{j}\,dy + \int_{z_1}^{z_2}\phi(x,y,z)\,\mathbf{k}\,dz$$

where A and B are initial and final points of the curve, with coordinates
(x1,y1,z1) and (x2,y2,z2). The first integral on the right may be evaluated
when y and z are known in terms of x for points on the curve C. The
remaining integrals are determined in a similar fashion. The problem thus
reduces to the usual line integral in scalar calculus except that it is
necessary to specify the direction in which the radius vector s describes the
curve during integration, for if the direction A to B is taken as positive,
then

$$\int_A^B = -\int_B^A$$

In case C is a closed curve, the direction is always taken so that the enclosed
area appears positive (cf. sec. 4.5).

No difficulty is experienced in the interpretation of (b) and (c) of (33)
as the following example shows. Let V = xy i − z² j + xyz k; evaluate
∫ V • ds from the point A = (0,0,0) to B = (1,1,1) along the curve

s = y + 



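$$\int_A^B \mathbf{V}\cdot d\mathbf{s} = \int_A^B (xy\,\mathbf{i} - z^2\,\mathbf{j} + xyz\,\mathbf{k})\cdot(\mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz) = \int_A^B (xy\,dx - z^2\,dy + xyz\,dz)$$

Since s is the position vector of points on the curve, the coordinates of
any point in terms of t are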

X = y = z - 
Hence,


r V • ds = r 7?dx — r y^dy + T z^dz 

Ja ^0 ^0 
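A line integral of type (b) can also be evaluated numerically once the curve is given in terms of a parameter. The sketch below is an illustration under explicit assumptions: the field is taken as V = xy i − z² j + xyz k, and the path x = t, y = t², z = t³ from (0,0,0) to (1,1,1) is an assumed choice made for this example. The function names and the number of subdivisions are likewise arbitrary.

```python
def V(x, y, z):
    """Sample field V = xy i - z^2 j + xyz k."""
    return (x * y, -z * z, x * y * z)

def path(t):
    """Assumed path from (0,0,0) to (1,1,1): x = t, y = t^2, z = t^3."""
    return (t, t ** 2, t ** 3)

def line_integral(field, curve, n=4000):
    """Midpoint-rule approximation to the integral of field . ds for 0 <= t <= 1."""
    total = 0.0
    for k in range(n):
        t0, t1 = k / n, (k + 1) / n
        a, b = curve(t0), curve(t1)
        v = field(*curve(0.5 * (t0 + t1)))          # field at the middle of the step
        total += sum(vi * (bi - ai) for vi, ai, bi in zip(v, a, b))
    return total

forward = line_integral(V, path)
backward = line_integral(V, lambda t: path(1.0 - t))   # same curve, B to A
print(forward, backward)
```

For this assumed path the integrand reduces to t³ − 2t⁷ + 3t⁸, whose integral from 0 to 1 is 1/3; traversing the curve from B to A reverses the sign.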


An important special case arises in scalar calculus when the function to
be integrated is an exact differential, where the value of the integral is
independent of the path. In vector calculus, suppose

$$\mathbf{V} = \operatorname{grad}\phi = \nabla\phi$$ (4-35)

with φ a scalar point function. Then using (22) and (34)

$$\int_A^B \mathbf{V}\cdot d\mathbf{s} = \int_A^B \nabla\phi\cdot d\mathbf{s} = \int_A^B\left\{\frac{\partial\phi}{\partial x}\,dx + \frac{\partial\phi}{\partial y}\,dy + \frac{\partial\phi}{\partial z}\,dz\right\} = \int_A^B d\phi = \phi_B - \phi_A$$

If the integration is taken around a closed curve, B = A, then

$$\oint \nabla\phi\cdot d\mathbf{s} = \oint d\phi = 0$$ (4-36)

Conversely, if ∮ V • ds = 0 for every closed curve, then (35) must hold, i.e., V is the gradient of
some scalar point function φ. We have therefore shown that if V = grad φ,
the line integral ∫ V • ds from A to B depends only on the initial and final values of
φ and is independent of the path.

4.16. Surface and Volume Integrals. — Let Σ be any surface, divided into
infinitesimal elements each of which may be considered as a vector, dS.
The surface integral may then be described as in ordinary analysis, but
again there are three cases:

$$\text{(a)}\ \iint_S \phi\,d\mathbf{S};\qquad \text{(b)}\ \iint_S \mathbf{V}\cdot d\mathbf{S};\qquad \text{(c)}\ \iint_S \mathbf{V}\times d\mathbf{S}$$

giving a vector, a scalar and a vector. As before, it is important to specify
the side of the surface over which the integration is performed, for although
dS is normal to the surface, the signs of the normals on opposite sides are
opposite. The sign of the normal is uniquely determined by the previous
conventions except for the case of a one-sided surface† such as the Möbius
strip. If the surface encloses a portion of space, dS is taken as the outward
pointing normal. The surface integral ∬ V • dS is called the flux of V
through the surface, for if V is the product of density and the velocity of a
fluid, the integral is the amount of fluid flowing through a surface in unit
time. The vector V may also refer to electric, magnetic or gravitational
force, flow of heat and so on.

† See, for example, Burington, R. S., and Torrance, C. C., "Higher Mathematics,"
McGraw-Hill Book Co., New York, 1939, pp. 250ff.

Let dτ = dx dy dz be an element of volume. Since this is a scalar, there
are only two possible volume integrals

$$\text{(a)}\ \iiint \phi\,d\tau;\qquad \text{(b)}\ \iiint \mathbf{V}\,d\tau$$

the first being a scalar and the second a vector.

It is often convenient to convert multiple integrals into others with
fewer integral signs. One possibility has previously been presented in (36),
namely that the line integral ∫ V • ds along C may be reduced to the difference
between two scalar quantities, provided V = ∇φ. A line integral may also
be converted into a double or surface integral by Stokes’ theorem, or
conversely, the double integral may be reduced to a single integral.

4.17. Stokes’ Theorem. — This theorem may be stated in the form

$$\oint_C \mathbf{V}\cdot d\mathbf{s} = \iint_S \nabla\times\mathbf{V}\cdot d\mathbf{S}$$ (4-37)

Conversely, if W = ∇ × V, where V is another vector, then the value of
the surface integral ∬ W • dS depends only upon values of V at points
on the boundary of the surface,

$$\iint_S \mathbf{W}\cdot d\mathbf{S} = \oint_C \mathbf{V}\cdot d\mathbf{s}$$

The vector V may be taken as flux density of a fluid or as the field of a
mechanical or electrical force. In the latter case, the line integral
represents the work done on a particle moving along a curve C. If the curve is
closed, forming the boundary of a region Σ, then according to the theorem
the work done equals the surface integral of the curl of the force field.
In the special case where the work done is independent of the path, the line
integral around every closed curve vanishes, so that a requirement for
independence of the path is that ∇ × V = 0.
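Stokes’ theorem can be tried on a simple case before the proof is studied. The sketch below is a numerical illustration only: the field V = (−y, x², 0), the unit square in the XY-plane as the surface, the midpoint rule, and all function names are assumptions made for this example. The contour is described counterclockwise, so that the normal is along +k.

```python
def V(x, y):
    """x- and y-components of the sample field (-y, x^2, 0)."""
    return (-y, x * x)

def circulation(n=200):
    """Line integral of V . ds counterclockwise around the unit square."""
    corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
    total = 0.0
    for (x0, y0), (x1, y1) in zip(corners, corners[1:]):
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for k in range(n):
            xm, ym = x0 + (k + 0.5) * dx, y0 + (k + 0.5) * dy
            vx, vy = V(xm, ym)
            total += vx * dx + vy * dy
    return total

def curl_flux(n=200, h=1e-4):
    """Surface integral of the k-component of curl V over the unit square."""
    cell = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * cell, (j + 0.5) * cell
            curl_z = ((V(x + h, y)[1] - V(x - h, y)[1])
                      - (V(x, y + h)[0] - V(x, y - h)[0])) / (2.0 * h)
            total += curl_z * cell * cell
    return total

print(circulation(), curl_flux())   # the two values agree
```

Here the k-component of the curl is 2x + 1, whose integral over the square is 2, and the line integral receives a contribution of 1 from each of the sides x = 1 and y = 1.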

A proof of Stokes’ theorem follows. Consider a surface Σ bounded by
the closed contour C. Let C' bound the projection of Σ on the XY-plane:
we are thus associating a point P(x,y) on the plane with every point
P'(x,y,z) of the surface. This means that on the surface, where z is a
function of x and y, a function u(x,y,z) becomes

$$u(x,y,z) = \phi(x,y)$$ (4-38)

since the value of φ on C' must equal the value of u on C. Similarly with
other functions

$$v(x,y,z) = \chi(x,z); \qquad w(x,y,z) = \psi(y,z)$$

when projections are made on the XZ- and YZ-planes. We may write for
the vector defined at each point on the surface,

$$\mathbf{V} = u\mathbf{i} + v\mathbf{j} + w\mathbf{k}$$

If we furthermore take a unit vector n, perpendicular to the surface at any
point, the right-hand side of (37) becomes after expansion

$$\iint \mathbf{n}\cdot\nabla\times\mathbf{V}\,dS = \iint \mathbf{n}\cdot\{\nabla\times u\mathbf{i} + \nabla\times v\mathbf{j} + \nabla\times w\mathbf{k}\}\,dS$$ (4-39)

A typical term of (39) may be transformed as follows

$$\mathbf{n}\cdot\nabla\times u\mathbf{i} = \mathbf{n}\cdot\left\{\mathbf{j}\,\frac{\partial u}{\partial z} - \mathbf{k}\,\frac{\partial u}{\partial y}\right\} = -\mathbf{n}\cdot\mathbf{k}\,\frac{\partial\phi}{\partial y}$$ (4-40)

the second expression coming from (24). The last member of (40) is
obtained as follows. The partial derivative

$$\frac{\partial\mathbf{s}}{\partial y} = \mathbf{j} + \mathbf{k}\,\frac{\partial z}{\partial y}$$

of s = xi + yj + zk is a vector, tangent to the curve cut from Σ by a plane
perpendicular to the X-axis. It is perpendicular to n, hence

$$\mathbf{n}\cdot\left\{\mathbf{j} + \mathbf{k}\,\frac{\partial z}{\partial y}\right\} = 0$$ (4-41)

Substitution of (41) and the partial derivative of (38),

$$\frac{\partial\phi}{\partial y} = \frac{\partial u}{\partial y} + \frac{\partial u}{\partial z}\frac{\partial z}{\partial y}$$

into (40), gives the last term of that equation. Since n • k dS = dx dy, we
may write

$$\iint \mathbf{n}\cdot\nabla\times u\mathbf{i}\,dS = -\iint \frac{\partial\phi}{\partial y}\,dx\,dy$$ (4-42)

The integral on the right of (42) may be written

$$-\int (\phi_2 - \phi_1)\,dx$$

where φ2 and φ1 are the values of φ at the maximum and minimum values
of y, y2 and y1, respectively. If dσ is a line element of the contour C', we
may write dx = ±(dx/dσ)dσ, choosing the sign in accordance with the
position of dσ on the contour. Since it is negative at y2 and positive at y1,
the integral becomes

$$\int (\phi_2 + \phi_1)\,\frac{dx}{d\sigma}\,d\sigma = \oint_{C'} \phi\,\frac{dx}{d\sigma}\,d\sigma$$

Remembering (38) and the fact that between two points on C' the change
in x is the same as that between the equivalent points on C,

$$\oint_{C'} \phi\,\frac{dx}{d\sigma}\,d\sigma = \oint_C u\,\frac{dx}{ds}\,ds$$

so that we finally have

$$\iint \mathbf{n}\cdot\nabla\times u\mathbf{i}\,dS = \oint_C u\,dx$$

Similar equations are obtained from consideration of projections of Σ on
the XZ- and YZ-planes. When they are added together, Stokes’ theorem
results.

4.18. Theorem of the Divergence. — A method of reducing triple integrals
to double integrals is offered in the theorem of the divergence, which may
be written

$$\iiint \nabla\cdot\mathbf{V}\,d\tau = \iint_S \mathbf{V}\cdot d\mathbf{S}$$ (4-43)

The Cartesian form of this equation

$$\iiint \left\{\frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z}\right\}dx\,dy\,dz = \iint_S (V_x\,dy\,dz + V_y\,dx\,dz + V_z\,dx\,dy)$$

is often called Gauss’ theorem. Suppose V represents the flux density of an
incompressible fluid. Then, as we have shown, ∇ • V dτ is the total amount of
fluid flowing out of a volume dτ per second. The total flow from a large
volume is ∭ ∇ • V dτ, which must equal the rate of flow across all of
the surfaces of the volume, ∬ V • dS. This proves the theorem.† If
we assume a steady state, the total amount of flow neither increases nor
decreases in time and hence must be maintained constant by sources or
sinks within the region, unless the density of the fluid is continually
changing (which is contrary to the initial assumption). In view of Gauss’
theorem the divergence of the field takes on an interesting meaning. Since

$$\operatorname{div}\mathbf{V} = \nabla\cdot\mathbf{V} = \lim_{d\tau\to 0}\frac{1}{d\tau}\iint_S \mathbf{V}\cdot d\mathbf{S}$$ (4-44)

the divergence is the same as the intensity of the steady flow at a given
point, that is, the net outward flux per unit volume. This argument may be
continued to derive the equation of continuity which has been obtained in
another way in sec. 4.10.

† An analytical proof, which does not depend on the flow of a liquid and is similar
to the one given here for Stokes’ theorem, may be found in books on vector analysis.
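The theorem of the divergence can be illustrated numerically on a region with plane faces. The sketch below is only an example under stated assumptions: the field V = (x², yz, z), the unit cube as the region, the midpoint rule and the function names are all choices made here for illustration.

```python
def V(x, y, z):
    """Sample field (x^2, yz, z); its divergence is 2x + z + 1."""
    return (x * x, y * z, z)

def volume_integral_of_div(n=20, h=1e-4):
    """Midpoint-rule integral of div V over the unit cube, div V by central differences."""
    cell = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                x, y, z = (i + 0.5) * cell, (j + 0.5) * cell, (k + 0.5) * cell
                d = ((V(x + h, y, z)[0] - V(x - h, y, z)[0])
                     + (V(x, y + h, z)[1] - V(x, y - h, z)[1])
                     + (V(x, y, z + h)[2] - V(x, y, z - h)[2])) / (2.0 * h)
                total += d * cell ** 3
    return total

def outward_flux(n=20):
    """Integral of V . dS over the six faces, dS along the outward normal."""
    cell = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            a, b = (i + 0.5) * cell, (j + 0.5) * cell
            total += (V(1.0, a, b)[0] - V(0.0, a, b)[0]) * cell * cell  # faces x = 1, x = 0
            total += (V(a, 1.0, b)[1] - V(a, 0.0, b)[1]) * cell * cell  # faces y = 1, y = 0
            total += (V(a, b, 1.0)[2] - V(a, b, 0.0)[2]) * cell * cell  # faces z = 1, z = 0
    return total

print(volume_integral_of_div(), outward_flux())   # both close to 2.5
```

The minus signs in `outward_flux` arise because the outward normal on the faces x = 0, y = 0 and z = 0 points along the negative axes.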

A further application of the divergence theorem arises in the problem of
heat flow. Consider the flow of heat into a thermally isotropic solid body,
the temperature of which is not the same at all points. The rate of flow
of heat into the body is*

$$-\int \mathbf{V}\cdot d\mathbf{S}$$

where V is the flux of heat, the amount of heat which crosses unit area
drawn perpendicular to the lines of flow per unit time. By Fourier’s law,
heat flows in the direction of most rapid decrease in temperature, U, with a
rate proportional to the thermal conductivity, k, of the solid, or

$$\mathbf{V} = -k\nabla U$$ (4-45)

If there are no sources or sinks of heat within the body, and if ρ is the
density of the solid and s its specific heat, the amount of heat entering unit
volume in unit time is

$$s\rho\,\frac{\partial U}{\partial t}$$

For the whole body, the heat gained must equal that passing through the
surface

$$\int_\tau s\rho\,\frac{\partial U}{\partial t}\,d\tau = -\int_S \mathbf{V}\cdot d\mathbf{S}$$

and this becomes in view of eq. (43)

$$\int_\tau \left\{s\rho\,\frac{\partial U}{\partial t} + \nabla\cdot\mathbf{V}\right\}d\tau = 0$$

This equation must hold for every surface, and hence for every enclosed
volume, so that

$$s\rho\,\frac{\partial U}{\partial t} = -\nabla\cdot\mathbf{V}$$

Thus because of (45)

$$s\rho\,\frac{\partial U}{\partial t} = \nabla\cdot(k\nabla U)$$

or

$$\frac{\partial U}{\partial t} = a^2\,\nabla^2 U$$

with a² = k/sρ and k assumed constant. For a stationary state, ∇²U = 0;
this is Laplace’s equation: the same law holds for the distribution of
temperature as for the distribution of potential in charge-free space.

* Henceforth, single integral signs will be written in multiple integrals when the
meaning is otherwise clear.

4.19. Green’s Theorems. — The three fundamental relations (36),
(37) and (43) may be used to obtain a large number of formulas for the
transformation of integrals, the results corresponding to integration by
parts in scalar calculus. The two most important such formulas are
known as Green’s theorems, when given in Cartesian form. In vector
notation, these are

$$\int \nabla\phi\cdot\nabla\psi\,d\tau = \int \phi\nabla\psi\cdot d\mathbf{S} - \int \phi\nabla^2\psi\,d\tau = \int \psi\nabla\phi\cdot d\mathbf{S} - \int \psi\nabla^2\phi\,d\tau$$ (4-46)

$$\int (\phi\nabla^2\psi - \psi\nabla^2\phi)\,d\tau = \int (\phi\nabla\psi - \psi\nabla\phi)\cdot d\mathbf{S}$$ (4-47)

Green’s first theorem is easily found by substituting V = φ∇ψ in (43).
The second theorem is obtained by interchanging φ and ψ in (46) and
subtracting the result from (46).

Problem. Verify eqs. (46) and (47).

4.20. Tensors. — In many physical problems, the notion of a vector is
too restricted. For example, in an isotropic medium, stress S and strain X
are related by the vector equation S = kX, X and S having the same
direction. If the medium is not isotropic, S and X are not in general in the
same direction; it is then necessary to replace the scalar k by a more
general mathematical construct capable, when acting on the vector X,
of changing its direction as well as its magnitude. Such a construct is a
tensor. A similar generalization has to be made in the vector equations

$$\mathbf{P} = \epsilon\mathbf{E}$$

where P and E represent electric polarization and field strength, and

$$\mathbf{I} = \mu\mathbf{H}$$

where I and H represent intensity of magnetization and field strength; for
anisotropic media, the susceptibilities ε and μ must be replaced by tensors.

Again, if it is desired to represent the displacements δv of the points
in a strained elastic medium as functions of their position vectors v, a
tensor equation of the form δv = tv is needed, for δv and v differ in
direction, and the tensor t must effect this difference. This example will be
treated in detail in sec. 4.23; but first we shall discuss the analytical
properties of tensors.

Let us consider for complete generality a space of ν dimensions and
assume that two different reference frames are given so that a point whose
coordinates† in the first one are $(x^1, x^2, \ldots, x^\nu)$ has the coordinates
$(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^\nu)$ in the second system. Further let there be relations

$$\bar{x}^m = f^m(x^1, x^2, \ldots, x^\nu), \qquad m = 1, 2, 3, \ldots, \nu$$ (4-48)

so that we may transform from one system to the other. Then if ν quantities
$(A^1, A^2, \ldots, A^\nu)$ are related to ν other quantities $(\bar{A}^1, \bar{A}^2, \ldots, \bar{A}^\nu)$ by
the equations

$$\bar{A}^m = \sum_{i=1}^{\nu} \frac{\partial\bar{x}^m}{\partial x^i}\,A^i, \qquad m = 1, 2, \ldots, \nu$$ (4-49)

they are said to be the components of a contravariant vector or a tensor of
the first rank. To simplify the notation, it is customary to omit the
summation sign and sum over indices which are repeated on the same side of the
equation. An index which is not repeated is understood to take successively
the values 1, 2, ..., ν, so that there are altogether ν different equations.
With these conventions, we may rewrite (49) as

$$\bar{A}^m = \frac{\partial\bar{x}^m}{\partial x^i}\,A^i$$ (4-50)

A further word about notation should be added. Since a repeated index
(it is often called a dummy or umbral index) indicates summation, another
letter may be substituted for it at will. Thus (50) may also be written

$$\bar{A}^m = \frac{\partial\bar{x}^m}{\partial x^j}\,A^j = \frac{\partial\bar{x}^m}{\partial x^k}\,A^k, \quad\text{etc.}$$

We will often use the same symbol, such as $A^i$, to indicate both the tensor
and the i-th component of a tensor. No confusion should result from this
arrangement.

† The upper suffix is not an exponent. Its position has an important meaning as the
subsequent discussion will show.
A covariant vector with components $A_m$ in one system and $\bar{A}_m$ in another
is defined by the relation

$$\bar{A}_m = \frac{\partial x^i}{\partial\bar{x}^m}\,A_i$$ (4-51)

If (48) is differentiated we obtain

$$d\bar{x}^m = \frac{\partial\bar{x}^m}{\partial x^i}\,dx^i$$ (4-51a)

hence we see that the components of an ordinary vector in ν-dimensional
space are actually the components of a contravariant tensor of rank one.
To find an example of a covariant vector consider a scalar point function
$\bar\phi(\bar{x}^i) = \phi(x^i)$. The components of the gradient of φ will be $\partial\phi/\partial x^i$ and

$$\frac{\partial\bar\phi}{\partial\bar{x}^m} = \frac{\partial\phi}{\partial x^i}\frac{\partial x^i}{\partial\bar{x}^m}$$

Thus the gradient of such a function is a covariant vector. The reader
should not conclude, however, that a covariant vector is necessarily the
gradient of a scalar.

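These ideas may be extended easily to define tensors of any rank.
If $\bar\phi(\bar{x}^i) = \phi(x^i)$, we speak of φ as a tensor of zero rank or a scalar or
invariant. There are three varieties of second rank tensors† defined by the
transformations

$$\bar{A}^{mn} = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial\bar{x}^n}{\partial x^j}\,A^{ij}$$ (4-52)

$$\bar{A}_{mn} = \frac{\partial x^i}{\partial\bar{x}^m}\frac{\partial x^j}{\partial\bar{x}^n}\,A_{ij}$$ (4-53)

$$\bar{A}^m_n = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^n}\,A^i_j$$ (4-54)

They are called contravariant, covariant and mixed, respectively. A useful
mixed tensor of the second rank is the Kronecker delta

$$\delta^m_n = 1,\ m = n; \qquad \delta^m_n = 0,\ m \neq n$$ (4-55)

That it is a tensor is seen as follows. Suppose $\delta^i_j$ is this tensor in the
coordinate system $x^i$; then from (54)

$$\bar\delta^m_n = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^n}\,\delta^i_j = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^i}{\partial\bar{x}^n} = \frac{\partial\bar{x}^m}{\partial\bar{x}^n} = \delta^m_n$$

We thus see that $\delta^m_n$ has the same components in all coordinate systems.

† Tensors of the second rank are also called dyadics; see Gibbs-Wilson, loc. cit.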
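The transformation laws for contravariant and covariant components, and the invariance of the Kronecker delta, can be tried on a simple change of coordinates. The sketch below is an illustration only: the linear transformation in two dimensions, the arrays `J` and `K`, the sample components and the function names are all assumptions made for the example.

```python
# Assumed linear change of coordinates in two dimensions:
#   xbar^1 = 2 x^1 + x^2,   xbar^2 = x^2
J = [[2.0, 1.0], [0.0, 1.0]]     # J[m][i] = d xbar^m / d x^i
K = [[0.5, -0.5], [0.0, 1.0]]    # K[i][m] = d x^i / d xbar^m (inverse of J)

def contravariant(A):
    """Abar^m = (d xbar^m / d x^i) A^i, summed over the repeated index i."""
    return [sum(J[m][i] * A[i] for i in range(2)) for m in range(2)]

def covariant(B):
    """Bbar_m = (d x^i / d xbar^m) B_i, summed over the repeated index i."""
    return [sum(K[i][m] * B[i] for i in range(2)) for m in range(2)]

def delta_bar():
    """(d xbar^m / d x^i)(d x^i / d xbar^n): the transformed Kronecker delta."""
    return [[sum(J[m][i] * K[i][n] for i in range(2)) for n in range(2)]
            for m in range(2)]

A, B = [3.0, 4.0], [1.0, 2.0]    # sample components (assumed values)
A_bar, B_bar = contravariant(A), covariant(B)
print(A_bar, B_bar)
print(sum(a * b for a, b in zip(A, B)),
      sum(a * b for a, b in zip(A_bar, B_bar)))   # the sum A^i B_i is unchanged
print(delta_bar())
```

The two kinds of components change differently, yet the sum of products of a contravariant with a covariant component has the same value in both systems, as befits a scalar.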

Tensors of higher rank are defined by similar laws; for example, a
mixed tensor of rank four is

$$\bar{A}^m_{npq} = \frac{\partial\bar{x}^m}{\partial x^i}\frac{\partial x^j}{\partial\bar{x}^n}\frac{\partial x^k}{\partial\bar{x}^p}\frac{\partial x^l}{\partial\bar{x}^q}\,A^i_{jkl}$$ (4-56)

It should be noted that if ν is the number of dimensions of the coordinate
system, then a tensor of rank r has $\nu^r$ components.

Problem. Show that, if eq. (48) represents a rotation of Cartesian axes in ordinary
(or in general homaloidal) space,

$$\frac{\partial\bar{x}^m}{\partial x^i} = \frac{\partial x^i}{\partial\bar{x}^m}$$

so that there is no distinction between a covariant and a contravariant vector in ordinary
space.

4.21. Addition, Multiplication and Contraction. — The sum or difference
of two or more tensors of the same rank and type is a tensor of the same
rank and type. For example, if

$$C^{mn} = A^{mn} + B^{mn}$$

it follows from (52) that $C^{mn}$ is a tensor. It frequently happens that the
components of a tensor satisfy the relation

$$A^{mn} = A^{nm}$$

such a tensor being called symmetric. On the other hand, if $A^{mn} = -A^{nm}$,
the tensor is skew-symmetric. When neither of these relations holds, a
given tensor may always be written as the sum of a symmetric and a
skew-symmetric tensor. To see this let us take

$$B^{mn} = \tfrac{1}{2}(A^{mn} + A^{nm}), \qquad C^{mn} = \tfrac{1}{2}(A^{mn} - A^{nm})$$

where $A^{mn}$ is neither symmetric nor skew-symmetric. Then

$$A^{mn} = B^{mn} + C^{mn}$$

with $B^{mn}$ symmetric and $C^{mn}$ skew-symmetric. The property of being
symmetric or skew-symmetric is unaltered when a
tensor is transformed from one reference frame to another.
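The splitting of a second rank tensor into symmetric and skew-symmetric parts is easily carried out on an array of components. The following sketch is illustrative; the function name `decompose` and the sample array are assumptions.

```python
def decompose(A):
    """Return the symmetric part (A^mn + A^nm)/2 and the skew part (A^mn - A^nm)/2."""
    n = len(A)
    sym = [[0.5 * (A[m][k] + A[k][m]) for k in range(n)] for m in range(n)]
    skew = [[0.5 * (A[m][k] - A[k][m]) for k in range(n)] for m in range(n)]
    return sym, skew

# Sample components, neither symmetric nor skew-symmetric (assumed values).
A = [[1.0, 2.0, 0.0],
     [4.0, 3.0, -2.0],
     [6.0, 8.0, 5.0]]

S, K = decompose(A)
print(S)   # unchanged when the two indices are interchanged
print(K)   # changes sign when the two indices are interchanged
print([[S[m][k] + K[m][k] for k in range(3)] for m in range(3)])   # recovers A
```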

An important relation between vectors and skew-symmetric tensors is 
easily verified. Suppose A and B are two vectors in a three-dimensional 
rectangular coordinate system whose components are connected as shown 
in eq. (4) or (5) : 

$$\begin{aligned} B_x &= a_{11}A_x + a_{12}A_y + a_{13}A_z \\ B_y &= a_{21}A_x + a_{22}A_y + a_{23}A_z \\ B_z &= a_{31}A_x + a_{32}A_y + a_{33}A_z \end{aligned}$$ (4-57)

Now if the coefficients $a_{ij}$ were the components of a skew-symmetric tensor, we would have $a_{ij} = -a_{ji}$; $a_{ii} = 0$; hence

$$\begin{aligned} B_x &= -a_{21}A_y + a_{13}A_z \\ B_y &= a_{21}A_x - a_{32}A_z \\ B_z &= -a_{13}A_x + a_{32}A_y \end{aligned}$$ (4-57a)

Comparison with (14) shows us that we could also write

$$\mathbf{B} = \mathbf{T} \times \mathbf{A}$$

where $\mathbf{T}$ is a vector with components $T_x = a_{32}$; $T_y = a_{13}$; $T_z = a_{21}$.
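The equivalence between the skew-symmetric coefficients and the vector product can be checked with a few lines of code. This is only an illustrative sketch; the helper names `skew_from_vector`, `apply` and `cross` are assumptions made here and do not come from the text.

```python
def cross(t, a):
    """Ordinary vector product t x a of two 3-component lists."""
    return [t[1] * a[2] - t[2] * a[1],
            t[2] * a[0] - t[0] * a[2],
            t[0] * a[1] - t[1] * a[0]]


def skew_from_vector(t):
    """Skew-symmetric a_ij with a_32 = T_x, a_13 = T_y, a_21 = T_z."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def apply(a, v):
    """B_i = sum over j of a_ij A_j, as in eq. (4-57)."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]


T = [1.0, 2.0, 3.0]
A = [4.0, 5.0, 6.0]
B_matrix = apply(skew_from_vector(T), A)   # via the skew-symmetric tensor
B_cross = cross(T, A)                      # via the vector product
```

Both routes give the same three components, which is the content of the statement that $\mathbf{B} = \mathbf{T} \times \mathbf{A}$.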

In vector analysis it is often the custom to distinguish between a polar 
vector and an axial vector. The former is always used to represent trans- 
lations and mechanical forces while the latter is connected with a rotation. 
The typical axial vector is the vector product of two polar vectors, as B. 
It may be proved that the coefficients $a_{ij}$ of (57a) are the components of a tensor;⁹ hence we see that an axial vector is really a skew-symmetric tensor.

It seems well to remark that tensors and matrices are intimately related. 
Thus if we write (57) in matrix form and compare it with eq. (10-16) it is 
clear that the components of the tensor are also the elements of a matrix. 
The only difference lies in the fact that tensors may always be written in 
matrix form if so desired, but the elements of a matrix do not need to 
transform in the same manner as tensors. 

If we multiply $A^m$ by $B_n$ we obtain the mixed tensor $A^mB_n = C^m_n$. It is easily seen that $C^m_n$ transforms like (54). This type of product, called the outer product, may be obtained with tensors of any rank or type; thus $A^m_nB_{pq} = C^m_{npq}$. It should not be inferred, however, that every tensor can be written as a product in this way. Neither should we conclude that the outer product is the same as the vector product of sec. 4.5.

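Let us set $m = q$ in the mixed tensor of (56) and write $B_{np} = A^m_{npm}$. To show that our notation, which indicates that $A^m_{npm}$ is a covariant tensor of rank two, is justified we use the transformation law (56),

$$\bar A^m_{npm} = A^i_{jkl}\,\frac{\partial \bar x^m}{\partial x^i}\frac{\partial x^j}{\partial \bar x^n}\frac{\partial x^k}{\partial \bar x^p}\frac{\partial x^l}{\partial \bar x^m} = A^i_{jki}\,\frac{\partial x^j}{\partial \bar x^n}\frac{\partial x^k}{\partial \bar x^p}$$

Comparison with (53) convinces us that $A^i_{jki}$ is indeed a covariant tensor of rank two since it transforms in the required way. This process of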
summing over a pair of contravariant and covariant indices is called 

⁹ The proof is given by Abraham, M., and Becker, R., "The Classical Theory of Electricity and Magnetism," Blackie and Son, London, 1932.
contraction. It always reduces the rank of a mixed tensor by two; thus when it is applied to a mixed tensor of rank two the result is a scalar:

$$\bar A^m_m = A^i_j\,\frac{\partial \bar x^m}{\partial x^i}\frac{\partial x^j}{\partial \bar x^m} = A^i_i = A^m_m$$


When two tensors are multiplied together and then contracted, we speak of inner multiplication, thus

$$A^{mn}B_{npq} = C^m_{pq}; \qquad A^mB_m = \text{a scalar}$$

The last example is clearly equivalent to the scalar product in rectangular coordinates (cf. sec. 4.4), hence in tensor analysis, we say that if $l$ is the length of $A^m$ or $A_m$

$$l^2 = A^mA_m$$ (4-58)

From (8) we conclude that the angle $\theta$ between two vectors $A_m$ and $B_m$ is defined by

$$\cos\theta = \frac{A_mB^m}{\sqrt{(A_nA^n)(B_pB^p)}}$$

and if $A_m$ and $B_m$ are perpendicular to each other,

$$A_mB^m = 0$$

We have just shown how new tensors may be obtained by addition, 
multiplication and contraction. We now inquire whether it is possible to 
change contravariant tensors to covariant ones or the reverse. Let $g_{mn}$ be any symmetric covariant tensor and $g$ be the determinant of the components of $g_{mn}$. Also take the co-factor¹⁰ of $g_{mn}$ in $g$; then if we define

$$g^{mn} = \frac{\text{co-factor of } g_{mn}}{g}$$ (4-59)

it follows from the rules for the expansion of determinants that

$$g_{mn}g^{np} = \delta^p_m$$ (4-60)


We would like to justify our notation and prove that $g^{mn}$ is actually a tensor. Let $A^n$ be a vector, then $B_m = g_{mn}A^n$ is also a vector; moreover

$$g^{nm}B_m = g^{nm}g_{mp}A^p = \delta^n_pA^p = A^n$$

so that $g^{nm}$ changes a covariant vector into a contravariant one; hence it must itself be a tensor.
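The construction of $g^{mn}$ from co-factors, and its use in raising an index, can be illustrated for three dimensions. This is a sketch under stated assumptions: the symmetric array `g` below is an arbitrary example, and the function names are hypothetical helpers written for this illustration.

```python
def minor_det(g, i, j):
    """2x2 determinant left after deleting row i and column j of a 3x3 array."""
    rows = [r for r in range(3) if r != i]
    cols = [c for c in range(3) if c != j]
    return (g[rows[0]][cols[0]] * g[rows[1]][cols[1]]
            - g[rows[0]][cols[1]] * g[rows[1]][cols[0]])


def cofactor(g, i, j):
    return (-1) ** (i + j) * minor_det(g, i, j)


def determinant(g):
    """Expansion of the determinant along the first row."""
    return sum(g[0][j] * cofactor(g, 0, j) for j in range(3))


def contravariant_metric(g):
    """g^{mn} = (co-factor of g_mn) / g, for a symmetric 3x3 g_mn."""
    det = determinant(g)
    return [[cofactor(g, m, n) / det for n in range(3)] for m in range(3)]


g = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
g_up = contravariant_metric(g)

A_up = [1.0, 2.0, 3.0]                                                  # A^n
B_down = [sum(g[m][n] * A_up[n] for n in range(3)) for m in range(3)]   # B_m = g_mn A^n
A_back = [sum(g_up[n][m] * B_down[m] for m in range(3)) for n in range(3)]
```

Lowering the index with `g` and raising it again with `g_up` returns the original components up to rounding, which mirrors the argument of the text.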

Two vectors related by the equations

$$A^m = g^{mn}A_n$$

or

$$A_m = g_{mn}A^n$$

are called associated. It is often said that both are the same vector, $A^m$ being the contravariant components and $A_m$ the covariant ones. Tensors of any rank may be treated in the same way, thus

$$A^{mn} = g^{mp}g^{nq}A_{pq}; \qquad A_m{}^{n} = g_{mp}A^{pn}$$

It should be clear that

$$A^{mn}B_{mn} = A_m{}^{n}B^{m}{}_{n}$$

¹⁰ Note that the co-factor of $g_{mn}$ is not a tensor. See sec. 10.3 for discussion of determinants and sec. 5.16 for further properties of these tensors.

Because of the fact that dummy indices may be changed from one letter to 
another at will it follows that they enjoy a certain freedom of motion. 
They may be raised in one place if they are lowered in another. We have 
indicated this procedure in the last equation by spacing the indices. Such 
information is needed, for it is not true that

$$A_m{}^{n} = g_{mp}A^{pn} \quad \text{and} \quad A^{n}{}_{m} = g_{pm}A^{np}$$

are identical unless $A$ is a symmetrical tensor.

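4.22. Differentiation of Tensors. It has been shown in sec. 4.20 that the derivative of a scalar point function is a covariant vector. The derivative of a covariant vector is not a tensor, however, for if we differentiate the transformation law (51), $\bar A_m = A_k\,\partial x^k/\partial \bar x^m$, we find

$$\frac{\partial \bar A_m}{\partial \bar x^n} = \frac{\partial^2 x^k}{\partial \bar x^m\,\partial \bar x^n}\,A_k + \frac{\partial x^k}{\partial \bar x^m}\frac{\partial x^p}{\partial \bar x^n}\frac{\partial A_k}{\partial x^p}$$ (4-61)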


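and the presence of the second derivative shows that $\partial A_m/\partial x^n$ does not transform like a tensor. In order to find a derivative of the proper tensor character we first rewrite the second derivative in terms of first derivatives. To do this let us use the two tensors $g_{ij}$ and $g^{ij}$ defined previously. Let us further introduce the following quantities (they are not tensors) called the Christoffel three-index symbols

$$[mn,q] = \frac{1}{2}\left(\frac{\partial g_{mq}}{\partial x^n} + \frac{\partial g_{nq}}{\partial x^m} - \frac{\partial g_{mn}}{\partial x^q}\right)$$ (4-62)

$$\{mn,q\} = \frac{1}{2}\,g^{qs}\left(\frac{\partial g_{ms}}{\partial x^n} + \frac{\partial g_{ns}}{\partial x^m} - \frac{\partial g_{mn}}{\partial x^s}\right)$$ (4-63)

the significance of which will soon be evident. From these definitions we see that

$$\{mn,q\} = g^{qs}[mn,s]$$ (4-64)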
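According to (53), we have

$$\bar g_{mn} = g_{ij}\,\frac{\partial x^i}{\partial \bar x^m}\frac{\partial x^j}{\partial \bar x^n}$$ (4-65)

and it is also true that

$$\frac{\partial g_{ij}}{\partial \bar x^q} = \frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial \bar x^q}$$ (4-66)

Differentiating (65) and using (66) we get

$$\frac{\partial \bar g_{mn}}{\partial \bar x^q} = g_{ij}\left(\frac{\partial^2 x^i}{\partial \bar x^m\,\partial \bar x^q}\frac{\partial x^j}{\partial \bar x^n} + \frac{\partial x^i}{\partial \bar x^m}\frac{\partial^2 x^j}{\partial \bar x^n\,\partial \bar x^q}\right) + \frac{\partial x^i}{\partial \bar x^m}\frac{\partial x^j}{\partial \bar x^n}\frac{\partial x^k}{\partial \bar x^q}\frac{\partial g_{ij}}{\partial x^k}$$ (4-67)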
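In the same way if we differentiate $\bar g_{nq}$ and $\bar g_{mq}$ we obtain

$$\frac{\partial \bar g_{nq}}{\partial \bar x^m} = g_{ij}\left(\frac{\partial^2 x^i}{\partial \bar x^n\,\partial \bar x^m}\frac{\partial x^j}{\partial \bar x^q} + \frac{\partial x^i}{\partial \bar x^n}\frac{\partial^2 x^j}{\partial \bar x^q\,\partial \bar x^m}\right) + \frac{\partial x^i}{\partial \bar x^m}\frac{\partial x^j}{\partial \bar x^n}\frac{\partial x^k}{\partial \bar x^q}\frac{\partial g_{jk}}{\partial x^i}$$ (4-68)

$$\frac{\partial \bar g_{mq}}{\partial \bar x^n} = g_{ij}\left(\frac{\partial^2 x^i}{\partial \bar x^m\,\partial \bar x^n}\frac{\partial x^j}{\partial \bar x^q} + \frac{\partial x^i}{\partial \bar x^m}\frac{\partial^2 x^j}{\partial \bar x^q\,\partial \bar x^n}\right) + \frac{\partial x^i}{\partial \bar x^m}\frac{\partial x^j}{\partial \bar x^n}\frac{\partial x^k}{\partial \bar x^q}\frac{\partial g_{ik}}{\partial x^j}$$ (4-69)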


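We may exchange $i$ and $j$ in the second term on the right of these expressions. If we add (68) and (69), subtract (67) and use eq. (62) we obtain

$$\overline{[mn,q]} = g_{ij}\,\frac{\partial^2 x^i}{\partial \bar x^m\,\partial \bar x^n}\frac{\partial x^j}{\partial \bar x^q} + [ij,k]\,\frac{\partial x^i}{\partial \bar x^m}\frac{\partial x^j}{\partial \bar x^n}\frac{\partial x^k}{\partial \bar x^q}$$

where the bar over the Christoffel symbol indicates that it refers to the coordinate system $\bar x^i$. Now multiply this equation by $\bar g^{qr}(\partial x^s/\partial \bar x^r)$ and use (64), which gives

$$\overline{\{mn,r\}}\,\frac{\partial x^s}{\partial \bar x^r} = \bar g^{qr}g_{ij}\,\frac{\partial x^s}{\partial \bar x^r}\frac{\partial x^j}{\partial \bar x^q}\frac{\partial^2 x^i}{\partial \bar x^m\,\partial \bar x^n} + \bar g^{qr}\,\frac{\partial x^s}{\partial \bar x^r}\frac{\partial x^k}{\partial \bar x^q}\frac{\partial x^i}{\partial \bar x^m}\frac{\partial x^j}{\partial \bar x^n}\,[ij,k]$$

By means of (52) we may eliminate $\bar g^{qr}$ from the right-hand side of this equation to obtain

$$\overline{\{mn,r\}}\,\frac{\partial x^s}{\partial \bar x^r} = g^{sj}g_{ij}\,\frac{\partial^2 x^i}{\partial \bar x^m\,\partial \bar x^n} + g^{sk}[ij,k]\,\frac{\partial x^i}{\partial \bar x^m}\frac{\partial x^j}{\partial \bar x^n}$$

Finally, remembering that $g^{sj}g_{ij} = \delta^s_i$, we see that

$$\frac{\partial^2 x^s}{\partial \bar x^m\,\partial \bar x^n} = \overline{\{mn,r\}}\,\frac{\partial x^s}{\partial \bar x^r} - \{ij,s\}\,\frac{\partial x^i}{\partial \bar x^m}\frac{\partial x^j}{\partial \bar x^n}$$

Let us put this result into (61) which then becomes

$$\frac{\partial \bar A_m}{\partial \bar x^n} = \left[\overline{\{mn,r\}}\,\frac{\partial x^k}{\partial \bar x^r} - \{ij,k\}\,\frac{\partial x^i}{\partial \bar x^m}\frac{\partial x^j}{\partial \bar x^n}\right]A_k + \frac{\partial x^i}{\partial \bar x^m}\frac{\partial x^j}{\partial \bar x^n}\frac{\partial A_i}{\partial x^j}$$ (4-70)

where we have changed the dummy indices in the last term from $k$ and $p$ to $i$ and $j$. Finally we see from (51) that we have

$$\bar A_r = A_k\,\frac{\partial x^k}{\partial \bar x^r}$$

so that (70) may be written

$$\frac{\partial \bar A_m}{\partial \bar x^n} - \overline{\{mn,r\}}\,\bar A_r = \left(\frac{\partial A_i}{\partial x^j} - \{ij,k\}A_k\right)\frac{\partial x^i}{\partial \bar x^m}\frac{\partial x^j}{\partial \bar x^n}$$

Now if we use the comma abbreviation

$$A_{i,j} = \frac{\partial A_i}{\partial x^j} - \{ij,h\}A_h$$

it follows that

$$\bar A_{m,n} = A_{i,j}\,\frac{\partial x^i}{\partial \bar x^m}\frac{\partial x^j}{\partial \bar x^n}$$

hence this quantity is a covariant tensor of the second rank. It is called the covariant derivative of $A_i$ with respect to $g_{ij}$.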
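The Christoffel symbols of (62) to (64) can be evaluated numerically once the components $g_{ij}$ are known as functions of the coordinates. The sketch below assumes plane polar coordinates $(r, \theta)$ with $g_{11} = 1$, $g_{22} = r^2$ as a test case and uses central differences for the derivatives; all function names are hypothetical and chosen only for this illustration.

```python
def metric_polar(x):
    """g_ij for plane polar coordinates x = (r, theta); an assumed example."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]


def d_metric(metric, x, k, h=1e-5):
    """Central-difference estimate of d g_ij / d x^k, returned as a matrix."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)] for i in range(n)]


def christoffel_first(metric, x):
    """first[m][n][q] = [mn,q] = 1/2 (dg_mq/dx^n + dg_nq/dx^m - dg_mn/dx^q)."""
    dim = len(x)
    dg = [d_metric(metric, x, k) for k in range(dim)]   # dg[k][i][j]
    return [[[0.5 * (dg[n][m][q] + dg[m][n][q] - dg[q][m][n])
              for q in range(dim)] for n in range(dim)] for m in range(dim)]


def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def christoffel_second(metric, x):
    """second[m][n][q] = {mn,q} = g^{qs} [mn,s]  (two dimensions only)."""
    dim = len(x)
    first = christoffel_first(metric, x)
    g_up = inverse_2x2(metric(x))
    return [[[sum(g_up[q][s] * first[m][n][s] for s in range(dim))
              for q in range(dim)] for n in range(dim)] for m in range(dim)]


symbols = christoffel_second(metric_polar, [2.0, 0.3])
```

For this metric the only non-vanishing symbols are $\{22,1\} = -r$ and $\{12,2\} = \{21,2\} = 1/r$, which the numerical values reproduce to within the finite-difference error.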

In a similar way it may be shown that the covariant derivative of $A^i$ with respect to $g_{ij}$ is

$$A^i{}_{,j} = \frac{\partial A^i}{\partial x^j} + \{jh,i\}A^h$$ (4-71)

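Problem a. Prove that $[mn,p] = g_{pq}\{mn,q\}$.

Problem b. Show that covariant derivatives of tensors of the second rank may be derived in the form

$$A_{ij,k} = \frac{\partial A_{ij}}{\partial x^k} - A_{ih}\{jk,h\} - A_{hj}\{ik,h\}$$

$$A^{ij}{}_{,k} = \frac{\partial A^{ij}}{\partial x^k} + A^{ih}\{hk,j\} + A^{hj}\{hk,i\}$$

$$A^i_{j,k} = \frac{\partial A^i_j}{\partial x^k} + A^h_j\{hk,i\} - A^i_h\{jk,h\}$$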

4.23. Tensors and the Elastic Body. As an example of the use of the tensor in a physical problem let us consider a deformable body subjected to an infinitely small deformation or strain. Let $P_0$ be a point of the body in the unstrained state and let $P$ be its deformed position. If the coordinates of $P_0$ and $P$ are $x_0^r$ and $x^r$ then the components $u_0^r$ of the displacement vector will be

$$u_0^r = x^r - x_0^r$$ (4-72)

Suppose $Q_0$ is a neighboring point as shown in Fig. 11 which is deformed to the position $Q$. Now if the components of the vectors $P_0Q_0$ and $PQ$ are $\eta_0^r$ and $\eta^r$, the coordinates of $Q_0$ and $Q$ will be $x_0^r + \eta_0^r$ and $x^r + \eta^r$. It follows¹¹ that

$$(x^r + \eta^r) - (x_0^r + \eta_0^r) = u_0^r + \left(\frac{\partial u^r}{\partial x^s}\right)_0\eta_0^s$$ (4-73)

and, on using (72), that¹²

$$\delta\eta^r = \eta^r - \eta_0^r = \left(\frac{\partial u^r}{\partial x^s}\right)_0\eta_0^s$$ (4-74)

The coefficients $(\partial u^r/\partial x^s)_0$ which relate the two vectors $\delta\eta^r$ and $\eta_0^s$ are the components of a tensor. The terms $(\partial u^1/\partial x^1)$, $(\partial u^2/\partial x^2)$, $(\partial u^3/\partial x^3)$ are tension strains parallel to the axes $x^1$, $x^2$, $x^3$, respectively. The remaining terms are shearing strains about these axes; for example, $(\partial u^1/\partial x^2 + \partial u^2/\partial x^1)$ is the shearing strain about the axis perpendicular to $x^1$ and $x^2$.

[Fig. 11]

If the nine components of the tensor are written out it will be seen that it is not in general symmetric. However, it can be made so as shown in sec. 4.21. Dropping the zero subscripts from (74) we write

$$\delta\eta^r = f^r_s\eta^s = e^r_s\eta^s + \omega^r_s\eta^s$$ (4-75)

where

$$f^r_s = \partial u^r/\partial x^s; \qquad e^r_s = \tfrac{1}{2}(f^r_s + f^s_r); \qquad \omega^r_s = \tfrac{1}{2}(f^r_s - f^s_r)$$ (4-76)

The coefficients $e^r_s$ are now the components of a symmetrical tensor, which is called a pure strain. It may be shown (see problem at end of this section) that $\omega^r_s$ represents a rotation of the neighborhood of $P_0$ about $P$. We could also add to (75) a translation by the amount $a^r$, so that

$$\delta\eta^r = a^r + e^r_s\eta^s + \omega^r_s\eta^s$$

represents the most general displacement of an elastic body, the total motion being composed of: (1) a translation, (2) a pure strain, and (3) a rotation.

¹¹ The zero subscript on the derivative is meant to indicate that it is evaluated at the point $P_0$.

¹² This result only holds for rectangular coordinates. If (74) is to hold in generalized coordinates, we must use the covariant derivative of $u^r$.

This brief discussion of tensors is entirely inadequate to indicate their great value in mathematical physics. The subject has been most frequently employed in the general theory of relativity.¹³ It may also be applied with advantage in the study of dynamics, electricity, and hydrodynamics.¹⁴ The material presented here is sufficient for the use which will be made of tensors in this book.

Problem. Show that the tensor $\omega^r_s$ represents a rotation.

Hint: Write out the components of (76) and it will be seen that the resulting vector is the vector product of two other vectors.

¹³ Eddington, A. S., "The Mathematical Theory of Relativity," Second Edition, Cambridge Press, 1930.

¹⁴ These subjects have been so treated by McConnell, A. J., "Applications of the Absolute Differential Calculus," Blackie and Sons, London, 1931, and more briefly by Thomas, T. Y., "The Elementary Theory of Tensors," McGraw-Hill Book Co., New York, 1931. See also Kron, G., "Short Course in Tensor Analysis for Electrical Engineers," John Wiley and Sons, New York, 1942.



CHAPTER 5 


COORDINATE SYSTEMS 

VECTORS AND CURVILINEAR COORDINATES 

5.1. Curvilinear Coordinates. Although the methods of vector analysis prove convenient in the statement of physical laws, it is usually necessary to rewrite the vector equations in terms of suitable coordinates before the final solution of a specific problem can be obtained. It is the purpose of this chapter to show¹ how the components of vectors or vector operators may be formulated in a system of curvilinear coordinates, the latter being of so general a nature that it is an easy matter to transform from them to any one of the several kinds of special coordinate systems which have been found useful in physical problems.

In Cartesian coordinates, the position of a point $P(x,y,z)$ is determined by the intersection of three mutually perpendicular planes, $x$ = const., $y$ = const., $z$ = const. When $x$, $y$ and $z$ are related to three new quantities $q_1$, $q_2$, $q_3$ by the equations

$$\begin{aligned} x &= x(q_1,q_2,q_3) \\ y &= y(q_1,q_2,q_3) \\ z &= z(q_1,q_2,q_3) \end{aligned}$$ (5-1)

with inverses,

$$\begin{aligned} q_1 &= q_1(x,y,z) \\ q_2 &= q_2(x,y,z) \\ q_3 &= q_3(x,y,z) \end{aligned}$$ (5-2)

a given point may be described by specifying either $x$, $y$, $z$ or $q_1$, $q_2$, $q_3$, for each equation of (2) represents a surface and the intersection of three such surfaces locates the point. The surfaces $q_1$ = const., $q_2$ = const., $q_3$ = const. are called the coordinate surfaces; the space curves formed by their intersection in pairs are called the coordinate lines. The coordinate axes are determined by the tangents to the coordinate lines at the intersection of three surfaces. They are not in general fixed directions in space, as is true for simple Cartesian coordinates. The quantities $(q_1,q_2,q_3)$ are the curvilinear coordinates of a point $P(x,y,z)$.

¹ The relations which we derive here may be obtained in other ways; see sec. 5.16 and Hobson, E. W., "The Theory of Spherical and Ellipsoidal Harmonics," Cambridge Press, 1931.




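From (1),

$$\begin{aligned} dx &= \frac{\partial x}{\partial q_1}dq_1 + \frac{\partial x}{\partial q_2}dq_2 + \frac{\partial x}{\partial q_3}dq_3 \\ dy &= \frac{\partial y}{\partial q_1}dq_1 + \frac{\partial y}{\partial q_2}dq_2 + \frac{\partial y}{\partial q_3}dq_3 \\ dz &= \frac{\partial z}{\partial q_1}dq_1 + \frac{\partial z}{\partial q_2}dq_2 + \frac{\partial z}{\partial q_3}dq_3 \end{aligned}$$

hence the square of the distance between two adjacent points,

$$ds^2 = dx^2 + dy^2 + dz^2 = Q_{11}^2dq_1^2 + Q_{22}^2dq_2^2 + Q_{33}^2dq_3^2 + 2Q_{12}\,dq_1dq_2 + 2Q_{13}\,dq_1dq_3 + 2Q_{23}\,dq_2dq_3$$ (5-3)

where,

$$Q_{ij} = \frac{\partial x}{\partial q_i}\frac{\partial x}{\partial q_j} + \frac{\partial y}{\partial q_i}\frac{\partial y}{\partial q_j} + \frac{\partial z}{\partial q_i}\frac{\partial z}{\partial q_j} \qquad (i,j = 1,2,3;\ i \neq j)$$

$$Q_{ii}^2 = \left(\frac{\partial x}{\partial q_i}\right)^2 + \left(\frac{\partial y}{\partial q_i}\right)^2 + \left(\frac{\partial z}{\partial q_i}\right)^2$$ (5-4)

For convenience we shall hereafter omit a repeated subscript, writing for instance $Q_1$ instead of $Q_{11}$.

The distance between two points on a coordinate line is called the line element. It is given by eq. (3) when variation is limited to only one of the $q$'s,

$$ds_i = Q_i\,dq_i \qquad (i = 1,2,3)$$ (5-5)

The direction cosines between these line elements and $dx$, $dy$ or $dz$ may be arranged as shown in Table 1 of sec. 4.1; for example, the cosine of the angle between $ds_1$ and $dz$ is $(\partial z/\partial q_1)(dq_1/ds_1) = (\partial z/\partial q_1)/Q_1$. The cosine of the angle $\theta_{ij}$ between $ds_i$ and $ds_j$ from (4-1) is

$$\cos\theta_{ij} = Q_{ij}/Q_iQ_j$$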

The most useful coordinate systems are orthogonal ones, that is, systems in which surfaces always intersect at right angles. We shall limit ourselves to such systems in secs. 5.2 to 5.15, returning to the more general case of non-orthogonal systems in sec. 5.16. For the present, then, $\cos\theta_{ij} = 0$, $Q_{ij} = 0$, and the cross product terms may be dropped from (3). The three possible surface elements in orthogonal systems thus become

$$dS_{ij} = ds_i\,ds_j = Q_iQ_j\,dq_i\,dq_j \qquad (i,j = 1,2,3;\ i \neq j)$$ (5-6)

and the volume element

$$d\tau = ds_1\,ds_2\,ds_3 = Q_1Q_2Q_3\,dq_1\,dq_2\,dq_3$$ (5-7)
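The recipe of eq. (4) lends itself to a direct numerical check: differentiate the transformation (1) with respect to each $q_i$ and form the sums of products. The sketch below is an illustration under stated assumptions. It uses the familiar spherical polar transformation as the test case, approximates the partial derivatives by central differences, and all function names are hypothetical.

```python
import math


def spherical_to_cartesian(q):
    """x, y, z in terms of q = (r, theta, phi); the assumed test transformation."""
    r, theta, phi = q
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def tangent_vectors(transform, q, h=1e-6):
    """Central-difference estimates of d(x, y, z)/dq_i for i = 1, 2, 3."""
    vectors = []
    for i in range(3):
        qp, qm = list(q), list(q)
        qp[i] += h
        qm[i] -= h
        xp, xm = transform(qp), transform(qm)
        vectors.append([(a - b) / (2 * h) for a, b in zip(xp, xm)])
    return vectors


def metric_sums(transform, q):
    """sums[i][j]: the sums of products of eq. (5-4).

    The diagonal entries are the squares Q_i^2; the off-diagonal
    entries are the Q_ij, which vanish for an orthogonal system.
    """
    t = tangent_vectors(transform, q)
    return [[sum(a * b for a, b in zip(t[i], t[j])) for j in range(3)]
            for i in range(3)]


q = (2.0, 0.7, 1.1)
sums = metric_sums(spherical_to_cartesian, q)
scale = [math.sqrt(sums[i][i]) for i in range(3)]   # Q_1, Q_2, Q_3
```

For this transformation the computed values agree with $Q_1 = 1$, $Q_2 = r$, $Q_3 = r\sin\theta$, and the off-diagonal sums are zero to within the differencing error, as expected for an orthogonal system.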
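5.2. Vector Relations in Curvilinear Coordinates. If $\phi$ is a scalar point function, $\nabla\phi$ must be the same in all coordinate systems, for $\nabla\phi$ is a vector whose magnitude and direction give the maximum space rate of change of $\phi$. A component of $\nabla\phi$ is its directional derivative (see sec. 4.11) in the given direction, thus the component perpendicular to the surface $q_1$ = constant and hence in the direction of $s_1$ is

$$\frac{\partial\phi}{\partial s_1} = \frac{1}{Q_1}\frac{\partial\phi}{\partial q_1}$$

in accordance with eq. (5). Since it is also possible to regard $\nabla$ as a vector operator, it may be written in terms of unit vectors, $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ along the curvilinear coordinate axes. Thus,

$$\nabla = \frac{\mathbf{u}_1}{Q_1}\frac{\partial}{\partial q_1} + \frac{\mathbf{u}_2}{Q_2}\frac{\partial}{\partial q_2} + \frac{\mathbf{u}_3}{Q_3}\frac{\partial}{\partial q_3}$$ (5-8)

so that

$$\nabla\phi = \frac{\mathbf{u}_1}{Q_1}\frac{\partial\phi}{\partial q_1} + \frac{\mathbf{u}_2}{Q_2}\frac{\partial\phi}{\partial q_2} + \frac{\mathbf{u}_3}{Q_3}\frac{\partial\phi}{\partial q_3}$$ (5-9)

Any vector may be written in terms of curvilinear components $V_1$, $V_2$, $V_3$:

$$\mathbf{V} = \mathbf{u}_1V_1 + \mathbf{u}_2V_2 + \mathbf{u}_3V_3$$ (5-10)

but in order to find $\nabla\cdot\mathbf{V}$ (see sec. 4.10) in curvilinear coordinates, we must know the relation between $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ and $x$, $y$, $z$. We proceed by evaluating $\nabla\cdot\mathbf{u}_i$, starting with $\nabla\times\mathbf{u}_i$ since this is needed to obtain $\nabla\cdot\mathbf{u}_i$.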

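Remembering that $\mathbf{u}_1/Q_1$ is the product of a scalar and a vector, we may write in view of (4-26)

$$\nabla\times\left(\frac{\mathbf{u}_1}{Q_1}\right) = \frac{1}{Q_1}\nabla\times\mathbf{u}_1 - \mathbf{u}_1\times\nabla\left(\frac{1}{Q_1}\right)$$ (5-11)

the change of sign coming from the change of order in the vector product. From (9), we note that $\mathbf{u}_1/Q_1 = \nabla q_1$ and from (4-30) that

$$\nabla\times\nabla q_1 = 0$$

hence,

$$\nabla\times\mathbf{u}_1 = Q_1\,\mathbf{u}_1\times\nabla\left(\frac{1}{Q_1}\right)$$ (5-12)

Now using (8) and performing the differentiation, we find

$$\nabla\left(\frac{1}{Q_1}\right) = -\frac{\mathbf{u}_1}{Q_1^3}\frac{\partial Q_1}{\partial q_1} - \frac{\mathbf{u}_2}{Q_1^2Q_2}\frac{\partial Q_1}{\partial q_2} - \frac{\mathbf{u}_3}{Q_1^2Q_3}\frac{\partial Q_1}{\partial q_3}$$ (5-13)

When we further recall that

$$\mathbf{u}_i\times\mathbf{u}_i = 0; \qquad \mathbf{u}_i\times\mathbf{u}_j = \mathbf{u}_k$$ (5-14)

and substitute (13) in (12), we obtain

$$\nabla\times\mathbf{u}_1 = \frac{\mathbf{u}_2}{Q_1Q_3}\frac{\partial Q_1}{\partial q_3} - \frac{\mathbf{u}_3}{Q_1Q_2}\frac{\partial Q_1}{\partial q_2}$$ (5-15)

The scalar product of $\nabla$ and a unit vector may be written as

$$\nabla\cdot\mathbf{u}_1 = \nabla\cdot(\mathbf{u}_2\times\mathbf{u}_3) = \mathbf{u}_3\cdot(\nabla\times\mathbf{u}_2) - \mathbf{u}_2\cdot(\nabla\times\mathbf{u}_3)$$ (5-16)

by using (14) and (4-26). This becomes

$$\nabla\cdot\mathbf{u}_1 = \frac{1}{Q_1Q_2Q_3}\frac{\partial(Q_2Q_3)}{\partial q_1}$$ (5-17)

when we expand the vector product by (15) and use the fact that

$$\mathbf{u}_i\cdot\mathbf{u}_i = 1; \qquad \mathbf{u}_i\cdot\mathbf{u}_j = 0$$ (5-18)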

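In order to determine $\nabla\cdot\mathbf{V}$ in curvilinear coordinates, we see from (10), that

$$\nabla\cdot\mathbf{V} = \nabla\cdot(\mathbf{u}_1V_1) + \nabla\cdot(\mathbf{u}_2V_2) + \nabla\cdot(\mathbf{u}_3V_3)$$

a typical term becoming

$$\nabla\cdot(\mathbf{u}_1V_1) = V_1\nabla\cdot\mathbf{u}_1 + \mathbf{u}_1\cdot\nabla V_1$$ (5-19)

by (4-26). When $\nabla\cdot\mathbf{u}_1$ is written in the form of (17), $\nabla V_1$ in the form of (9) and (18) used to eliminate the scalar products of the unit vectors, the three terms of (19) reduce to

$$\nabla\cdot\mathbf{V} = \frac{1}{Q_1Q_2Q_3}\left[\frac{\partial}{\partial q_1}(V_1Q_2Q_3) + \frac{\partial}{\partial q_2}(V_2Q_1Q_3) + \frac{\partial}{\partial q_3}(V_3Q_1Q_2)\right]$$ (5-20)

If $\mathbf{V} = \nabla\phi$,

$$\nabla\cdot\nabla\phi = \frac{1}{Q_1Q_2Q_3}\left[\frac{\partial}{\partial q_1}\left(\frac{Q_2Q_3}{Q_1}\frac{\partial\phi}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\left(\frac{Q_1Q_3}{Q_2}\frac{\partial\phi}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\left(\frac{Q_1Q_2}{Q_3}\frac{\partial\phi}{\partial q_3}\right)\right]$$ (5-21)

since the components of $\nabla\phi$ are $V_i = (\partial\phi/\partial q_i)/Q_i$.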
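Eq. (5-21) can be tested on functions whose Laplacian is known. The sketch below is an illustration only, assuming spherical polar scale factors $Q = (1, r, r\sin\theta)$ and nested central differences for the derivatives; the function names and the step size `h` are choices made here, not part of the text.

```python
import math


def scale_factors_spherical(q):
    """(Q_1, Q_2, Q_3) for q = (r, theta, phi); an assumed test system."""
    r, theta, _phi = q
    return (1.0, r, r * math.sin(theta))


def partial(f, q, i, h):
    """Central-difference estimate of df/dq_i at the point q."""
    qp, qm = list(q), list(q)
    qp[i] += h
    qm[i] -= h
    return (f(qp) - f(qm)) / (2 * h)


def laplacian(phi, scale_factors, q, h=1e-3):
    """Evaluate the right-hand side of eq. (5-21) numerically."""
    total = 0.0
    for i in range(3):
        def flux(p, i=i):
            Q = scale_factors(p)
            weight = Q[0] * Q[1] * Q[2] / Q[i] ** 2   # Q_j Q_k / Q_i
            return weight * partial(phi, p, i, h)
        total += partial(flux, q, i, h)
    Q = scale_factors(q)
    return total / (Q[0] * Q[1] * Q[2])


point = [2.0, 0.7, 1.1]
lap_r_squared = laplacian(lambda q: q[0] ** 2, scale_factors_spherical, point)
lap_z = laplacian(lambda q: q[0] * math.cos(q[1]), scale_factors_spherical, point)
```

The first test function is $r^2 = x^2 + y^2 + z^2$, whose Laplacian is 6; the second is $z = r\cos\theta$, which is harmonic. Both values are reproduced to within the truncation error of the differences.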

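The curl of a vector in terms of the unit curvilinear vectors becomes

$$\nabla\times\mathbf{V} = \nabla\times(\mathbf{u}_1V_1) + \nabla\times(\mathbf{u}_2V_2) + \nabla\times(\mathbf{u}_3V_3)$$

which may be expanded by using (11), to give terms like

$$\nabla\times(\mathbf{u}_iV_i) = V_i(\nabla\times\mathbf{u}_i) - \mathbf{u}_i\times(\nabla V_i)$$

When three similar equations are added together, the result in determinantal form is

$$\nabla\times\mathbf{V} = \frac{1}{Q_1Q_2Q_3}\begin{vmatrix} Q_1\mathbf{u}_1 & Q_2\mathbf{u}_2 & Q_3\mathbf{u}_3 \\ \partial/\partial q_1 & \partial/\partial q_2 & \partial/\partial q_3 \\ V_1Q_1 & V_2Q_2 & V_3Q_3 \end{vmatrix}$$ (5-22)

In order to compute $\nabla^2\mathbf{V}$ in curvilinear coordinates, use is made of the relation (4-32),

$$\nabla^2\mathbf{V} = \nabla(\nabla\cdot\mathbf{V}) - \nabla\times\nabla\times\mathbf{V}$$

which may be reduced to the desired form by means of (8), (20) and (22). The component of the resulting expression along the $\mathbf{u}_1$ direction is given by

$$\frac{1}{Q_1}\frac{\partial}{\partial q_1}(\nabla\cdot\mathbf{V}) - \frac{1}{Q_2Q_3}\left[\frac{\partial}{\partial q_2}\Big(Q_3(\nabla\times\mathbf{V})_3\Big) - \frac{\partial}{\partial q_3}\Big(Q_2(\nabla\times\mathbf{V})_2\Big)\right]$$ (5-23)

where $(\nabla\times\mathbf{V})_3$ and $(\nabla\times\mathbf{V})_2$ are the components of $\nabla\times\mathbf{V}$ along $\mathbf{u}_3$ and $\mathbf{u}_2$. The two other components of $\nabla^2\mathbf{V}$ are obtained from (23) by cyclic permutation of the subscripts 1, 2, 3.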

The task of computing any of these vector quantities in special coordinate systems is seen to involve calculation of the $Q_i$ which may be done in a straightforward way from (4) provided relations like (1) or (2) are known. In the remainder of this chapter we discuss those special systems which appear to be most useful. We include all those which may be used to solve the three-dimensional Schrödinger wave equation of quantum mechanics. It has been shown² that the method of separation of variables (cf. Chapter 7) is applicable to this equation only if the potential energy is of the form

$$V = Q_1f(q_1) + Q_2f(q_2) + Q_3f(q_3)$$ (5-24)

and the coordinates have certain special properties. There are eleven such systems; these are the ones described in secs. 5.3-5.9, 5.11-5.13 and the confocal ellipsoidal system of sec. 5.6 expressed in terms of elliptic integrals. We indicate other examples of the use of some of the systems as we proceed. In each case, we describe the geometry, give the relations between the new coordinates and $x$, $y$, $z$ and list the resulting $Q_i$ obtained from (4). Calculation of $\nabla\phi$, $\nabla\times\mathbf{V}$, etc., may be performed as an exercise by the student³ (see problems in later sections).

² Robertson, H. P., Math. Ann. 98, 749 (1928); Eisenhart, L. P., Phys. Rev. 45, 427 (1934).

SPECIAL ORTHOGONAL COORDINATE SYSTEMS 

5.3. Cartesian Coordinates. These form a trivial case of curvilinear coordinate systems.

$$Q_x = Q_y = Q_z = 1$$ (5-25)

5.4. Spherical Polar Coordinates. The coordinate surfaces are families of: (1) concentric spheres about the origin ($r$ = const.), (2) right circular cones with apex at the origin and axis along $z$ ($\theta$ = const.), (3) half-planes through the Z-axis ($\phi$ = const.). A point $P(x,y,z)$ is located by specifying the radius $r$ of the sphere on which it lies, its colatitude $\theta$, and its longitude or azimuth $\phi$ on the sphere. From Fig. 1, it follows that

$$\begin{aligned} x &= r\sin\theta\cos\phi \\ y &= r\sin\theta\sin\phi \\ z &= r\cos\theta \end{aligned}$$ (5-26)

Remembering that $ds_i = Q_i\,dq_i$, values of the $Q_i$ may also be determined by inspection from the figure, thus

$$Q_r = 1; \qquad Q_\theta^2 = r^2; \qquad Q_\phi^2 = r^2\sin^2\theta$$ (5-27)

³ Some of these quantities for certain of the systems may be found in Pauling and Wilson, "Introduction to Quantum Mechanics," Appendix IV, McGraw-Hill Book Co., New York, 1935; see also, Adams, E. P., "Smithsonian Mathematical Formulae," Washington, 1922.

5.5. Cylindrical Coordinates. The coordinate surfaces are: (1) right circular cylinders which form families of concentric circles about the origin in the XY-plane ($\rho$ = const.); (2) half-planes through the Z-axis ($\phi$ = const.); (3) planes parallel to the XY-plane ($z$ = const.). A point
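sheet ($\mu$ = const.); (3) hyperboloids of two sheets ($\nu$ = const.) given by the equations

$$\begin{aligned} \frac{x^2}{a^2-\lambda} + \frac{y^2}{b^2-\lambda} + \frac{z^2}{c^2-\lambda} &= 1 \\ \frac{x^2}{a^2-\mu} + \frac{y^2}{b^2-\mu} + \frac{z^2}{c^2-\mu} &= 1 \\ \frac{x^2}{a^2-\nu} + \frac{y^2}{b^2-\nu} + \frac{z^2}{c^2-\nu} &= 1 \end{aligned}$$ (5-30)

where $\lambda$, $\mu$, $\nu$ are parameters called ellipsoidal coordinates; $a$, $b$, $c$ are constants; $a^2 > \nu > b^2 > \mu > c^2 > \lambda > -\infty$. It is shown in books on solid analytical geometry that intersections of these three surfaces are orthogonal and that all of them have common foci. Moreover, through any fixed point $P(x,y,z)$ there passes one and only one surface of each type.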

The relation between the new and the old coordinates may be found by solving (30) directly. It may be done more easily as follows. Consider the cubic equation in a parameter $q$

$$\frac{x^2}{a^2-q} + \frac{y^2}{b^2-q} + \frac{z^2}{c^2-q} - 1 = 0$$ (5-31)

with three real roots, $\lambda$, $\mu$, $\nu$ satisfying the inequalities just stated. As $q$ varies between $a^2$ and $-\infty$, (31) describes the complete system of confocal surfaces given in (30). On clearing (31) of fractions and equating it to its identity, we have

$$x^2(b^2-q)(c^2-q) + y^2(a^2-q)(c^2-q) + z^2(a^2-q)(b^2-q) - (a^2-q)(b^2-q)(c^2-q) = (q-\lambda)(q-\mu)(q-\nu) = 0$$ (5-32)

and this must hold for every value of $q$. Upon setting $q = a^2$, $b^2$, $c^2$ in turn, we obtain

$$\begin{aligned} x^2 &= \frac{(a^2-\lambda)(a^2-\mu)(a^2-\nu)}{(b^2-a^2)(c^2-a^2)} \\ y^2 &= \frac{(b^2-\lambda)(b^2-\mu)(b^2-\nu)}{(a^2-b^2)(c^2-b^2)} \\ z^2 &= \frac{(c^2-\lambda)(c^2-\mu)(c^2-\nu)}{(a^2-c^2)(b^2-c^2)} \end{aligned}$$ (5-33)


Taking the logarithm of (33), differentiating partially with respect to $\lambda$ and using (4), we have

$$Q_\lambda^2 = \frac{1}{4}\left[\frac{(a^2-\mu)(a^2-\nu)}{(a^2-\lambda)(b^2-a^2)(c^2-a^2)} + \frac{(b^2-\mu)(b^2-\nu)}{(b^2-\lambda)(a^2-b^2)(c^2-b^2)} + \frac{(c^2-\mu)(c^2-\nu)}{(c^2-\lambda)(a^2-c^2)(b^2-c^2)}\right]$$ (5-34)

Values for $Q_\mu^2$ and $Q_\nu^2$ may be obtained in a similar way or from (34) by cyclic interchange of $\lambda$, $\mu$, $\nu$. Simplification of the resulting expressions yields

$$\begin{aligned} Q_\lambda^2 &= \frac{1}{4}\left[\frac{(\mu-\lambda)(\nu-\lambda)}{(a^2-\lambda)(b^2-\lambda)(c^2-\lambda)}\right] \\ Q_\mu^2 &= \frac{1}{4}\left[\frac{(\nu-\mu)(\lambda-\mu)}{(a^2-\mu)(b^2-\mu)(c^2-\mu)}\right] \\ Q_\nu^2 &= \frac{1}{4}\left[\frac{(\lambda-\nu)(\mu-\nu)}{(a^2-\nu)(b^2-\nu)(c^2-\nu)}\right] \end{aligned}$$ (5-35)

It is somewhat laborious to transform (34) directly into the first equation of (35) but their equivalence may be verified by writing the latter in terms of partial fractions.

Because of the fact that $x$, $y$ and $z$ appear as squares in (33), a given point $P(x,y,z)$ is not uniquely determined by $(\lambda,\mu,\nu)$; in fact, eight points symmetrically located relative to the $(XYZ)$-axes correspond to the set $(\lambda,\mu,\nu)$. This ambiguity may be resolved by adopting some convention concerning the signs of $(\lambda,\mu,\nu)$, or in more elegant fashion by the introduction of elliptic functions. The latter procedure may be accomplished either by means of elliptic integrals, Jacobian elliptic functions or Weierstrass $\wp$-functions.⁴
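The three roots of the cubic can be located numerically, since the polynomial form (5-32) changes sign exactly once in each of the intervals $(-\infty, c^2)$, $(c^2, b^2)$ and $(b^2, a^2)$. The sketch below is one possible approach using bisection, assuming $x$, $y$, $z$ are all different from zero; the function names and the sample constants are hypothetical.

```python
def confocal_polynomial(q, x, y, z, a2, b2, c2):
    """Left-hand side of eq. (5-32), a cubic in q."""
    return (x * x * (b2 - q) * (c2 - q) + y * y * (a2 - q) * (c2 - q)
            + z * z * (a2 - q) * (b2 - q) - (a2 - q) * (b2 - q) * (c2 - q))


def bisect(f, lo, hi, iterations=200):
    """Root of f in [lo, hi], assuming f changes sign across the interval."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ellipsoidal_coordinates(x, y, z, a2, b2, c2):
    """Return (lam, mu, nu) with lam < c2 < mu < b2 < nu < a2."""
    f = lambda q: confocal_polynomial(q, x, y, z, a2, b2, c2)
    step = 1.0
    while f(c2 - step) >= 0:      # search downward until the sign changes
        step *= 2.0
    lam = bisect(f, c2 - step, c2)
    mu = bisect(f, c2, b2)
    nu = bisect(f, b2, a2)
    return lam, mu, nu


a2, b2, c2 = 9.0, 4.0, 1.0        # a^2 > b^2 > c^2, sample values
lam, mu, nu = ellipsoidal_coordinates(1.0, 1.0, 1.0, a2, b2, c2)

# Substituting the roots into eq. (5-33) should give back x^2, y^2, z^2.
x2 = (a2 - lam) * (a2 - mu) * (a2 - nu) / ((b2 - a2) * (c2 - a2))
y2 = (b2 - lam) * (b2 - mu) * (b2 - nu) / ((a2 - b2) * (c2 - b2))
z2 = (c2 - lam) * (c2 - mu) * (c2 - nu) / ((a2 - c2) * (b2 - c2))
```

Only the squares of the Cartesian coordinates are recovered, which is the eightfold ambiguity noted in the text.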

The confocal ellipsoidal coordinate system has proved useful in problems of mechanics, potential theory, electrodynamics and hydrodynamics.⁵

5.7. Prolate Spheroidal Coordinates. Degenerate cases of the preceding system may arise if two or three of the axes in (31) become equal. Additional surfaces are then needed since the resulting equation in $q$ is either quadratic or linear. Instead of following a method similar to that used for ellipsoidal coordinates, it is simpler to proceed by considering the equations of an ellipse and a hyperbola,

$$\frac{z^2}{a^2} + \frac{x^2}{a^2(1-e_1^2)} = 1; \qquad \frac{z^2}{a^2} - \frac{x^2}{a^2(e_2^2-1)} = 1$$ (5-36)


⁴ Full details concerning these functions may be found in Whittaker, E. T. and Watson, G. N., "A Course of Modern Analysis," Fourth Edition, Cambridge Press, 1927.

⁵ Some references to these applications are: MacMillan, W. D., "Statics and Dynamics of a Particle," 1927, "The Theory of the Potential," 1930, McGraw-Hill Book Co.; Kellogg, O. D., "Foundations of Potential Theory," J. Springer, Berlin, 1929; Mason, Max and Weaver, Warren, "The Electromagnetic Field," University of Chicago Press, 1929; Milne-Thomson, L. M., "Theoretical Hydrodynamics," Macmillan and Co., London, 1938.
where $a$ is the semi-major axis and $e_1 < 1$ is the eccentricity of the ellipse, $e_2 > 1$, the eccentricity of the hyperbola. If we now substitute $a\cosh u$ for $a$ and sech $u$ for $e_1$ in the ellipse, $a\cos v$ for $a$ and $\sec v$ for $e_2$ in the hyperbola and finally $r^2$ for $x^2$ we obtain

$$\frac{z^2}{a^2\cosh^2u} + \frac{r^2}{a^2\sinh^2u} = 1; \qquad \frac{z^2}{a^2\cos^2v} - \frac{r^2}{a^2\sin^2v} = 1$$ (5-37)

with $0 \le u \le \infty$, $0 \le v \le \pi$. These equations represent the confocal families of: (1) prolate spheroids⁶ ($u$ = const.) and (2) hyperboloids (of two sheets) of revolution ($v$ = const.) obtained by rotating the ellipses and hyperbolas of (36) around the Z-axis. The intersection of these surfaces, as shown in Fig. 3, will be a circle of radius $r$; hence if $0 \le \phi \le 2\pi$, the addition of (3), a family of planes through the Z-axis ($\phi$ = const.), to the spheroids and hyperboloids gives us three suitable coordinate surfaces $(u,v,\phi)$. We may then solve (37) for $z$ and $r$ and simplify the resulting expressions by means of the relations between trigonometric functions. Finally, we set $x = r\cos\phi$, $y = r\sin\phi$, obtaining

$$\begin{aligned} x &= a\sinh u\sin v\cos\phi \\ y &= a\sinh u\sin v\sin\phi \\ z &= a\cosh u\cos v \end{aligned}$$ (5-38)

and from (4),

$$Q_u^2 = Q_v^2 = a^2(\sinh^2u + \sin^2v); \qquad Q_\phi^2 = a^2\sinh^2u\sin^2v$$ (5-39)

An important property of prolate spheroidal coordinates makes them useful in certain quantum mechanical problems. It is well known from analytical geometry that the sum of the focal radii of an ellipse is a constant, equal to the major axis. Similarly the difference between the focal radii in a hyperbola equals the transverse axis. If $r_A$ and $r_B$ are the distances from the two foci to a point of intersection of the ellipsoids and hyperboloids, we find that

$$r_A + r_B = 2a\cosh u; \qquad r_A - r_B = 2a\cos v$$

where we have replaced $a$ by $a\cosh u$ and by $a\cos v$ as before. This procedure thus locates a point relative to any two-center problem such as the diatomic molecule (see sec. 11.21). It is often convenient to introduce the
⁶ Also called ovary ellipsoids.
coordinates $\xi$ and $\eta$ in place of $\cosh u$ and $\cos v$, respectively, so that

$$\xi = \frac{r_A + r_B}{2a}; \qquad \eta = \frac{r_A - r_B}{2a}$$ (5-40)

In terms of these variables, the volume element may be seen to take the form

$$d\tau = a^3(\xi^2 - \eta^2)\,d\xi\,d\eta\,d\phi$$

5.8. Oblate Spheroidal Coordinates. When ellipses are rotated about their minor axis, the resulting surfaces are oblate spheroids.⁷ If we rewrite (37) so that the axis of revolution is again the Z-axis, but now⁸ the minor axis of the ellipse, we have

$$\frac{r^2}{a^2\cosh^2u} + \frac{z^2}{a^2\sinh^2u} = 1; \qquad \frac{r^2}{a^2\sin^2v} - \frac{z^2}{a^2\cos^2v} = 1$$ (5-41)


with $0 \le u \le \infty$, $0 \le v \le \pi$, $x = r\cos\phi$, $y = r\sin\phi$, $0 \le \phi \le 2\pi$. The coordinate surfaces are thus: (1) oblate spheroids ($u$ = const.); (2) hyperboloids (of one sheet) of revolution ($v$ = const.); (3) planes through the Z-axis ($\phi$ = const.). From (41), we find

$$\begin{aligned} x &= a\cosh u\sin v\cos\phi \\ y &= a\cosh u\sin v\sin\phi \\ z &= a\sinh u\cos v \end{aligned}$$ (5-42)

and from (4),

$$Q_u^2 = Q_v^2 = a^2(\sinh^2u + \cos^2v); \qquad Q_\phi^2 = a^2\cosh^2u\sin^2v$$ (5-43)

The geometry of the system may be inferred from Fig. 3 by suitable interchange of the X-, Y- and Z-axes.

⁷ Also called planetary ellipsoids. The figures of the earth and of the planet Jupiter are approximately of this form.

⁸ At the risk of some confusion, we have interchanged axes in this system and in some of the following ones so that the Z-axis is always the axis of revolution.

5.9. Elliptic Cylindrical Coordinates. If (37) is again rewritten with $x^2$ in place of $z^2$ and $y^2$ in place of $r^2$, the loci of these equations are cylindrical surfaces, whose elements are parallel to the Z-axis and perpendicular to the XY-plane. Their intersections with this plane are ellipses and hyperbolas. The coordinate surfaces are: (1) elliptic cylinders ($u$ = const.); (2) hyperbolic cylinders ($v$ = const.); (3) planes parallel to the
XY-plane ($z$ = const.). Proceeding as before,

$$\begin{aligned} x &= a\cosh u\cos v \\ y &= a\sinh u\sin v \\ z &= z \end{aligned}$$ (5-44)

$$Q_u^2 = Q_v^2 = a^2(\sinh^2u + \sin^2v); \qquad Q_z^2 = 1$$ (5-45)

The intersection of these cylinders with the XY-plane may also be inferred from Fig. 3.



5.10. Conical Coordinates. A further degenerate case of the system of sec. 5.6 arises when the orthogonal sets of surfaces are: (1) spheres with centers at the origin and radius $u$ ($u$ = const.); (2) cones with apexes at the origin and axes along the Z-axis ($v$ = const.); (3) cones with apexes at the origin and axes along the X-axis ($w$ = const.), their equations being

$$\begin{aligned} x^2 + y^2 + z^2 &= u^2 \\ \frac{x^2}{v^2} + \frac{y^2}{v^2-b^2} + \frac{z^2}{v^2-c^2} &= 0 \\ \frac{x^2}{w^2} + \frac{y^2}{w^2-b^2} + \frac{z^2}{w^2-c^2} &= 0 \end{aligned}$$ (5-46)

with $c^2 > v^2 > b^2 > w^2$. The projections of the surfaces on the XY-plane are families of circles, ellipses and triangles. From (46), we find

$$\begin{aligned} x^2 &= \frac{u^2v^2w^2}{b^2c^2} \\ y^2 &= \frac{u^2(v^2-b^2)(w^2-b^2)}{b^2(b^2-c^2)} \\ z^2 &= \frac{u^2(v^2-c^2)(w^2-c^2)}{c^2(c^2-b^2)} \end{aligned}$$ (5-47)

and from (4)

$$Q_u^2 = 1; \qquad Q_v^2 = \frac{u^2(v^2-w^2)}{(v^2-b^2)(c^2-v^2)}; \qquad Q_w^2 = \frac{u^2(v^2-w^2)}{(w^2-b^2)(w^2-c^2)}$$ (5-48)
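The expressions (5-47) can be substituted back into (5-46) as a consistency check. The short sketch below does this for one set of sample values satisfying $c^2 > v^2 > b^2 > w^2$; the function names and the numbers are chosen here purely for illustration.

```python
def conical_squares(u, v, w, b, c):
    """Squares of x, y, z from eq. (5-47); requires c^2 > v^2 > b^2 > w^2."""
    b2, c2, u2, v2, w2 = b * b, c * c, u * u, v * v, w * w
    x2 = u2 * v2 * w2 / (b2 * c2)
    y2 = u2 * (v2 - b2) * (w2 - b2) / (b2 * (b2 - c2))
    z2 = u2 * (v2 - c2) * (w2 - c2) / (c2 * (c2 - b2))
    return x2, y2, z2


def cone_residual(s, x2, y2, z2, b, c):
    """x^2/s^2 + y^2/(s^2 - b^2) + z^2/(s^2 - c^2), which (5-46) sets to zero."""
    s2 = s * s
    return x2 / s2 + y2 / (s2 - b * b) + z2 / (s2 - c * c)


u, v, w, b, c = 2.0, 1.5, 0.5, 1.0, 2.0
x2, y2, z2 = conical_squares(u, v, w, b, c)
sphere_check = x2 + y2 + z2                      # should equal u^2
v_cone_check = cone_residual(v, x2, y2, z2, b, c)
w_cone_check = cone_residual(w, x2, y2, z2, b, c)
```

With these values all three squares are positive, their sum equals $u^2$, and both cone equations are satisfied to rounding accuracy.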


5.11. Confocal Paraboloidal Coordinates. A system similar to that of sec. 5.6 has coordinate surfaces consisting of confocal families of: (1) elliptic paraboloids extending in the direction of the negative Z-axis ($\lambda$ = const.); (2) hyperbolic paraboloids ($\mu$ = const.); (3) elliptic paraboloids extending along the positive Z-axis ($\nu$ = const.). The equations for the surfaces are

$$\begin{aligned} \frac{x^2}{a^2-\lambda} + \frac{y^2}{b^2-\lambda} + 2z + \lambda &= 0 \\ \frac{x^2}{a^2-\mu} - \frac{y^2}{\mu-b^2} + 2z + \mu &= 0 \\ \frac{x^2}{\nu-a^2} + \frac{y^2}{\nu-b^2} - 2z - \nu &= 0 \end{aligned}$$ (5-49)

where $-\infty < \lambda < b^2 < \mu < a^2 < \nu < +\infty$. Proceeding as in the confocal ellipsoidal system, we may write the cubic equation in $q$,

$$\frac{x^2}{a^2-q} + \frac{y^2}{b^2-q} + 2z + q = 0$$ (5-50)

with three real roots, $\lambda$, $\mu$, $\nu$. As $q$ varies between $-\infty$ and $+\infty$, the complete system of confocal surfaces (49) will be described. On clearing (50) of fractions and equating it to its identity, we have

$$x^2(b^2-q) + y^2(a^2-q) + (2z+q)(a^2-q)(b^2-q) = (q-\lambda)(q-\mu)(q-\nu) = 0$$ (5-51)

Expressions for $x^2$ and $y^2$ may be obtained from (51) by setting $q = a^2$ and $b^2$ in turn; the result for $z$ is found by equating the coefficients of $q^2$ on both sides of (51). We thus have

$$\begin{aligned} x^2 &= \frac{(a^2-\lambda)(a^2-\mu)(a^2-\nu)}{(b^2-a^2)} \\ y^2 &= \frac{(b^2-\lambda)(b^2-\mu)(b^2-\nu)}{(a^2-b^2)} \\ z &= \tfrac{1}{2}(a^2 + b^2 - \lambda - \mu - \nu) \end{aligned}$$ (5-52)

and

$$\begin{aligned} Q_\lambda^2 &= \frac{1}{4}\frac{(\mu-\lambda)(\nu-\lambda)}{(a^2-\lambda)(b^2-\lambda)} \\ Q_\mu^2 &= \frac{1}{4}\frac{(\nu-\mu)(\lambda-\mu)}{(a^2-\mu)(b^2-\mu)} \\ Q_\nu^2 &= \frac{1}{4}\frac{(\lambda-\nu)(\mu-\nu)}{(a^2-\nu)(b^2-\nu)} \end{aligned}$$ (5-53)

Because of the appearance of $x$ and $y$ as squares in (52), a set $(\lambda,\mu,\nu)$ corresponds to four points $P(x,y,z)$ symmetrically located with respect to the XZ- and YZ-planes. As in the confocal ellipsoidal system (sec. 5.6) the ambiguity may be removed by the use⁹ of elliptic integrals.

5.12. Parabolic Coordinates. If two roots of (50) become equal, the preceding method fails since there are now only two surfaces. In this case, consider the families of parabolas

$$x^2 = 2\xi^2(z + \xi^2/2); \qquad x^2 = -2\eta^2(z - \eta^2/2)$$ (5-54)

The vertices of all parabolas lie on the Z-axis at distances $-\xi^2/2$ and $\eta^2/2$, respectively, and all of them have a common focus at the origin of the Cartesian coordinate system. If we now rotate these parabolas about the Z-axis, the resulting intersections are circles and the paraboloids of revolution are still given by (54) if we replace $x^2$ by $x^2 + y^2$, $x = r\cos\phi$, $y = r\sin\phi$. We thus obtain

$$\begin{aligned} x &= \xi\eta\cos\phi \\ y &= \xi\eta\sin\phi \\ z &= (\eta^2 - \xi^2)/2 \end{aligned}$$ (5-55)

and from (4),

$$Q_\xi^2 = Q_\eta^2 = (\xi^2 + \eta^2); \qquad Q_\phi^2 = \xi^2\eta^2$$ (5-56)

⁹ See, for example, Maxwell, J. C., "A Treatise on Electricity and Magnetism," Vol. I, Third Edition, Oxford Press, 1904, p. 240. Application of this system is also described there.

The coordinate surfaces are: (1) paraboloids of revolution extending in the direction of the positive Z-axis ($\xi$ = const.); (2) paraboloids of revolution extending toward the negative Z-direction ($\eta$ = const.); (3) planes through the Z-axis ($\phi$ = const.). Intersections of these surfaces with the XZ- and XY-planes are shown in Fig. 4. Parabolic coordinates have been used in the treatment of the Stark effect.¹⁰



Fig. 6-4 
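The relations (55) and (56) are easy to check numerically. The following Python sketch, which assumes the parametrization $x = \xi\eta\cos\varphi$, $y = \xi\eta\sin\varphi$, $z = (\eta^2 - \xi^2)/2$, maps one arbitrarily chosen point to Cartesian coordinates, confirms that it lies on both paraboloids of (54), and estimates the quantities $Q^2$ by central differences. The function names and the sample values are illustrative choices and do not come from the text.

```python
import math


def parabolic_to_cartesian(xi, eta, phi):
    """Map parabolic coordinates (xi, eta, phi) to Cartesian (x, y, z), eq. (5-55)."""
    x = xi * eta * math.cos(phi)
    y = xi * eta * math.sin(phi)
    z = 0.5 * (eta**2 - xi**2)
    return x, y, z


def scale_factors_squared(xi, eta, phi, h=1e-6):
    """Estimate Q_xi^2, Q_eta^2, Q_phi^2 from central differences of the position."""
    q = [xi, eta, phi]
    result = []
    for k in range(3):
        plus, minus = list(q), list(q)
        plus[k] += h
        minus[k] -= h
        p = parabolic_to_cartesian(*plus)
        m = parabolic_to_cartesian(*minus)
        result.append(sum(((a - b) / (2 * h)) ** 2 for a, b in zip(p, m)))
    return result


# Arbitrary sample point (illustrative values only).
xi, eta, phi = 1.3, 0.7, 0.4
x, y, z = parabolic_to_cartesian(xi, eta, phi)
r2 = x * x + y * y

# Residuals of the two paraboloids of eq. (5-54) with x^2 replaced by x^2 + y^2.
residual_xi = r2 - 2 * xi**2 * (z + xi**2 / 2)
residual_eta = r2 + 2 * eta**2 * (z - eta**2 / 2)

q2_numeric = scale_factors_squared(xi, eta, phi)
q2_expected = [xi**2 + eta**2, xi**2 + eta**2, xi**2 * eta**2]
print(residual_xi, residual_eta, q2_numeric, q2_expected)
```

Both residuals should vanish to rounding error, and the numerical values of $Q^2$ should agree with (56) to within the accuracy of the finite differences.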


5.13. Parabolic Cylindrical Coordinates. A system similar to elliptical
cylindrical coordinates is obtained by adding planes to the parabolic
cylinders represented by (54). If we replace $z$ by $y$ in those equations,
we have

$$x = \xi\eta$$

$$y = (\eta^2 - \xi^2)/2 \tag{5-57}$$

$$z = z$$

$$Q_\xi^2 = Q_\eta^2 = \xi^2 + \eta^2$$

$$Q_z^2 = 1 \tag{5-58}$$

The coordinate surfaces are: (1) parabolic cylinders ($\xi$ = const.); (2)
parabolic cylinders ($\eta$ = const.); (3) planes ($z$ = const.). The intersection
of these surfaces with the XY-plane is like the system of confocal
parabolas shown in Fig. 4.

¹⁰ Schrödinger, E., Ann. Physik 80, 467 (1926); Epstein, P. S., Phys. Rev. 28, 695
(1926).

5.14. Bipolar Coordinates. Before considering this system, we list a
few relations which are needed in the subsequent discussion. In terms of
exponentials, we may write

$$\sin x = \frac{1}{2i}\left(e^{ix} - e^{-ix}\right); \qquad \cos x = \frac{1}{2}\left(e^{ix} + e^{-ix}\right)$$

$$\tan x = \frac{\sin x}{\cos x} = \frac{i\left(1 - e^{2ix}\right)}{1 + e^{2ix}} \tag{5-59}$$

Replacing $x$ by $ix$, we have the corresponding hyperbolic functions

$$\sin ix = \frac{i}{2}\left(e^{x} - e^{-x}\right) = i\sinh x$$

$$\cos ix = \frac{1}{2}\left(e^{x} + e^{-x}\right) = \cosh x \tag{5-60}$$

$$\tan ix = \frac{i\left(e^{2x} - 1\right)}{e^{2x} + 1} = i\tanh x$$

We shall also need the inverse circular function $\tan^{-1} x = u$. Since
$x = \tan u$, it follows from (59) that

$$e^{2iu} = \frac{i - x}{i + x}; \qquad 2iu = \ln\frac{i - x}{i + x}$$

and

$$\tan^{-1} x = \frac{i}{2}\ln\frac{i + x}{i - x} \tag{5-61}$$

Suppose a point $P(x,y)$ is located as shown in Fig. 5 by means of two
vectors $\mathbf{r}_1$ and $\mathbf{r}_2$ and two angles $\theta_1$ and $\theta_2$. For different positions of the
point in the XY-plane, the vectors are always drawn from the fixed points
A and B symmetrically located on the X-axis a distance $2a$ apart. If
$p = x + iy$; $p^* = x - iy$, then

$$x = (p^* + p)/2; \qquad y = \frac{i}{2}(p^* - p) \tag{5-62}$$

The coordinates of the point are

$$p - a = r_1 e^{i\theta_1}$$

$$p + a = r_2 e^{i\theta_2} \tag{5-63}$$

and from the geometry of Fig. 5, it follows that

$$r_1^2 = (x - a)^2 + y^2; \qquad \theta_1 = \tan^{-1}\frac{y}{x - a}$$

$$r_2^2 = (x + a)^2 + y^2; \qquad \theta_2 = \tan^{-1}\frac{y}{x + a} \tag{5-64}$$

Defining new quantities

$$\xi = \theta_1 - \theta_2; \qquad \eta = \ln\frac{r_2}{r_1} \tag{5-65}$$

and dividing the two equations of (63) by each other

$$\frac{p + a}{p - a} = e^{-i\chi}; \qquad p = a\,\frac{e^{-i\chi} + 1}{e^{-i\chi} - 1} \tag{5-66}$$

where

$$\chi = \xi + i\eta \tag{5-67}$$

In order to find $x$ and $y$ as functions of $\xi$ and $\eta$, substitute (66) and (67)
in (62). When use is made of (59) and (60) the results are

$$x = \frac{a\sinh\eta}{\cosh\eta - \cos\xi}$$

$$y = \frac{a\sin\xi}{\cosh\eta - \cos\xi} \tag{5-68}$$

To find the form of the coordinate surfaces, we start from the definition
of $\xi$ and use (61) to obtain

$$\xi = \frac{i}{2}\ln\frac{(ix - ia + y)(ix + ia - y)}{(ix - ia - y)(ix + ia + y)}$$

which may also be written as

$$x^2 + y^2 - a^2 + 2iay\,\frac{1 + e^{2i\xi}}{1 - e^{2i\xi}} = 0$$

We observe from (59) that the last term of this expression equals
$-2ay/\tan\xi = -2ay\cot\xi$. Hence

$$x^2 + y^2 - a^2 - 2ay\cot\xi = 0$$

or

$$x^2 + (y - a\cot\xi)^2 = a^2(1 + \cot^2\xi) = a^2\csc^2\xi \tag{5-69}$$

In the same way we find

$$e^{2\eta} = \frac{r_2^2}{r_1^2} = \frac{(x + a)^2 + y^2}{(x - a)^2 + y^2}$$

and

$$(x - a\coth\eta)^2 + y^2 = a^2\,\mathrm{csch}^2\,\eta \tag{5-70}$$

We thus see that for $\xi$ = const., $0 \le \xi \le 2\pi$, we have a family of circles
with centers on the Y-axis at the point $x = 0$, $y = a\cot\xi$, the radii of the
circles being $a\csc\xi$. Each member of this family will pass through the
fixed points A and B as shown in Fig. 6 and will intersect the circles
$\eta$ = const. orthogonally. The members of the second family have radii
of length $a\,\mathrm{csch}\,\eta$ and have their centers on the X-axis at the points
$x = a\coth\eta$, $y = 0$. The point A is obtained when $\eta = +\infty$ and B when
$\eta = -\infty$. When $\eta = 0$, the circle degenerates into the Y-axis.

Fig. 5-6

The position of a point in the XY-plane is thus fixed when we know in
which quadrant it lies and furthermore the constant values of $\eta$, $\xi$ of the
circles which pass through it. Since the fixed points A and B (that is, the
X-axis) divide each circle of the set $\xi$ = const. into two segments, we
arbitrarily take $\xi = \xi_0 < \pi$ for the arc above the X-axis and $\xi = \xi_0 + \pi$ for all
points below this axis.

In order to use these circles as a coordinate system in space, imagine
them to be moved along the Z-axis. Then (69) and (70) represent two
families of right circular cylinders with axes parallel to the Z-axis.
Suitable coordinate surfaces are then: (1) cylinders with centers on the Y-axis
($\xi$ = const.); (2) cylinders with centers on the X-axis ($\eta$ = const.); (3)
planes perpendicular to the Z-axis ($z$ = const.). From (68) and (4),

$$Q_\xi^2 = Q_\eta^2 = \frac{a^2}{(\cosh\eta - \cos\xi)^2}$$

$$Q_z^2 = 1 \tag{5-71}$$

Bipolar coordinates are useful¹¹ in problems of hydrodynamics and
electricity.
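A short numerical check can make the geometry of (68) to (70) concrete. The Python sketch below assumes the transformation $x = a\sinh\eta/(\cosh\eta - \cos\xi)$, $y = a\sin\xi/(\cosh\eta - \cos\xi)$ and verifies, for a few arbitrarily chosen pairs $(\xi, \eta)$, that the resulting point lies on the circles (69) and (70) and that $\eta = \ln(r_2/r_1)$ is recovered from the distances to A and B. Function names and sample values are illustrative assumptions.

```python
import math


def bipolar_to_cartesian(xi, eta, a=1.0):
    """Eq. (5-68): Cartesian (x, y) from bipolar (xi, eta) with foci at (+a, 0), (-a, 0)."""
    denom = math.cosh(eta) - math.cos(xi)
    return a * math.sinh(eta) / denom, a * math.sin(xi) / denom


def circle_residuals(xi, eta, a=1.0):
    """Residuals of eqs. (5-69) and (5-70) at the point (xi, eta)."""
    x, y = bipolar_to_cartesian(xi, eta, a)
    cot_xi = math.cos(xi) / math.sin(xi)
    coth_eta = math.cosh(eta) / math.sinh(eta)
    res_xi = x**2 + (y - a * cot_xi) ** 2 - (a / math.sin(xi)) ** 2
    res_eta = (x - a * coth_eta) ** 2 + y**2 - (a / math.sinh(eta)) ** 2
    return res_xi, res_eta


def eta_from_distances(x, y, a=1.0):
    """eta = ln(r2 / r1), with r1 measured from A = (+a, 0) and r2 from B = (-a, 0)."""
    r1 = math.hypot(x - a, y)
    r2 = math.hypot(x + a, y)
    return math.log(r2 / r1)


A_HALF_SPACING = 2.0  # illustrative value of a
samples = [(0.5, 1.0), (1.2, -0.8), (2.5, 0.3), (4.0, 1.5)]

worst_circle_residual = max(
    abs(r) for xi, eta in samples for r in circle_residuals(xi, eta, A_HALF_SPACING)
)
worst_eta_error = max(
    abs(eta_from_distances(*bipolar_to_cartesian(xi, eta, A_HALF_SPACING), A_HALF_SPACING) - eta)
    for xi, eta in samples
)
print(worst_circle_residual, worst_eta_error)
```

Both printed numbers should be at the level of rounding error; the sample with $\xi > \pi$ lies below the X-axis, in line with the convention adopted above.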

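5.15. Toroidal Coordinates. If we rewrite (69) and (70) with $z$ substituted
for $y$ and $r$ for $x$, the resulting equations

$$2az\cot\xi = r^2 + z^2 - a^2$$

$$4a^2r^2\coth^2\eta = (r^2 + z^2 + a^2)^2 \tag{5-72}$$

represent the families of spheres and tores (or anchor rings) obtained by
rotating the circles of the previous system about the Z-axis. If we take as
the third surface planes through the Z-axis, $\varphi$ = const., then

$$y/x = \tan\varphi \tag{5-73}$$

The orthogonal coordinate surfaces are thus: (1) spheres with centers on
the axis of revolution at distances $\pm a\cot\xi$ from the origin and radii $a\csc\xi$
($\xi$ = const.); (2) anchor rings or tores, whose axial circles have radii
$a\coth\eta$ and whose cross-sections are circles of radii $a\,\mathrm{csch}\,\eta$ ($\eta$ = const.);
(3) planes through the Z-axis ($\varphi$ = const.). The spheres and anchor rings
have a common circle, $r = a$, $z = 0$. With methods similar to those used
in sec. 5.14,

$$x = r\cos\varphi; \qquad y = r\sin\varphi$$

$$r = \frac{a\sinh\eta}{\cosh\eta - \cos\xi} \tag{5-74}$$

$$z = \frac{a\sin\xi}{\cosh\eta - \cos\xi}$$

$$Q_\xi^2 = Q_\eta^2 = \frac{a^2}{(\cosh\eta - \cos\xi)^2}$$

$$Q_\varphi^2 = \frac{a^2\sinh^2\eta}{(\cosh\eta - \cos\xi)^2} \tag{5-75}$$

This system has found application¹² in certain problems of electricity and
of potential theory.

¹¹ See, for example, Milne-Thomson (loc. cit.); Maxwell (loc. cit.); Jeans, J. H.,
"The Mathematical Theory of Electricity and Magnetism," Fifth Edition, Cambridge
Press, 1925.

¹² See Hobson, loc. cit., for references.

Problem a. Show that in spherical polar coordinates:

$$\nabla\cdot\mathbf{V} = \frac{1}{r^2\sin\theta}\left\{\sin\theta\,\frac{\partial}{\partial r}(r^2V_r) + r\,\frac{\partial}{\partial\theta}(\sin\theta\,V_\theta) + r\,\frac{\partial V_\varphi}{\partial\varphi}\right\}$$

$$(\nabla\times\mathbf{V})_r = \frac{1}{r\sin\theta}\left\{\frac{\partial}{\partial\theta}(\sin\theta\,V_\varphi) - \frac{\partial V_\theta}{\partial\varphi}\right\}$$

$$(\nabla\times\mathbf{V})_\theta = \frac{1}{r\sin\theta}\left\{\frac{\partial V_r}{\partial\varphi} - \sin\theta\,\frac{\partial(rV_\varphi)}{\partial r}\right\}$$

$$(\nabla\times\mathbf{V})_\varphi = \frac{1}{r}\left\{\frac{\partial}{\partial r}(rV_\theta) - \frac{\partial V_r}{\partial\theta}\right\}$$

Problem b. Show that in cylindrical coordinates:

$$\nabla^2 = \frac{1}{\rho}\left\{\frac{\partial}{\partial\rho}\left(\rho\,\frac{\partial}{\partial\rho}\right) + \frac{1}{\rho}\frac{\partial^2}{\partial\varphi^2} + \rho\,\frac{\partial^2}{\partial z^2}\right\}$$

Problem c. If $V$ is the potential energy and $m$ is the mass of a particle show that
Newton's laws of motion become:

(1) in spherical polar coordinates

$$m\left(\ddot r - r\dot\theta^2 - r\sin^2\theta\,\dot\varphi^2\right) = -\frac{\partial V}{\partial r}$$

$$m\left\{\frac{1}{r}\frac{d}{dt}(r^2\dot\theta) - r\sin\theta\cos\theta\,\dot\varphi^2\right\} = -\frac{1}{r}\frac{\partial V}{\partial\theta}$$

$$m\left\{\frac{1}{r\sin\theta}\frac{d}{dt}\left(r^2\sin^2\theta\,\dot\varphi\right)\right\} = -\frac{1}{r\sin\theta}\frac{\partial V}{\partial\varphi}$$

(2) in cylindrical coordinates

$$m\left(\ddot\rho - \rho\dot\varphi^2\right) = -\frac{\partial V}{\partial\rho}$$

$$m\,\frac{1}{\rho}\frac{d}{dt}\left(\rho^2\dot\varphi\right) = -\frac{1}{\rho}\frac{\partial V}{\partial\varphi}$$

$$m\ddot z = -\frac{\partial V}{\partial z}$$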

NON-ORTHOGONAL COORDINATE SYSTEMS 


5.16. Tensor Relations in Curvilinear Coordinates. When the coordinate
surfaces of a curvilinear system are not orthogonal, the methods of
tensor analysis prove convenient (see sec. 4.20 ff.). The relations which
we are about to derive are more general than those obtained in the first
part of this chapter; in fact, we will show that the two formulations of
the problem become equivalent for orthogonal coordinates.

Let $x^1, x^2, x^3$ be the usual Cartesian coordinates of a point and
$q^1, q^2, q^3$ be its curvilinear coordinates, as discussed in sec. 5.1. Then in tensor
notation,¹³ eq. (3) becomes

$$ds^2 = g_{ij}\,dq^i dq^j \tag{5-3a}$$

where

$$g_{ij} = \frac{\partial x^k}{\partial q^i}\frac{\partial x^k}{\partial q^j} = g_{ji} \tag{5-4a}$$

is identical with $Q_{ij}$ of eq. (4). The line element is clearly

$$ds_i = \sqrt{g_{ii}}\,dq^i \quad (i \text{ not summed}) \tag{5-5a}$$

In order to find the surface element, we recall from sec. 4.5 that a surface
may be represented as the vector product of two other vectors. Thus
let $d\mathbf{s}_2$ be an infinitesimal displacement at the point $(q^1,q^2,q^3)$ along the
coordinate line $q^2$ and $d\mathbf{s}_3$ be a similar displacement along the line $q^3$.
Then the vector $d\mathbf{S}_1 = d\mathbf{s}_2 \times d\mathbf{s}_3$ is perpendicular to the plane $q^1$ = const.
and its magnitude $dS_1$ is the desired surface element in that plane. Before
we can obtain the appropriate expression for it in terms of the tensor $g_{ij}$
we must digress in order to consider two important systems of vectors in
curvilinear coordinates. Suppose $\mathbf{r} = \mathbf{r}(q^1,q^2,q^3)$ is a vector and

$$d\mathbf{r} = \frac{\partial\mathbf{r}}{\partial q^1}dq^1 + \frac{\partial\mathbf{r}}{\partial q^2}dq^2 + \frac{\partial\mathbf{r}}{\partial q^3}dq^3$$

is a small displacement. If we define three vectors

$$\mathbf{e}_i = \frac{\partial\mathbf{r}}{\partial q^i}$$

then it is clear that we may write

$$d\mathbf{r} = \mathbf{e}_i\,dq^i \tag{5-76}$$

¹³ We generally use the summation convention throughout the rest of this chapter.
In certain cases, repeated indices are not to be summed; such exceptions should be
obvious to the reader.


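These vectors, $\mathbf{e}_i$, which we call base vectors,¹⁴ are directed tangentially
along the coordinate curves but they are not necessarily of unit length.
While it is usually more convenient to resolve an arbitrary vector $\mathbf{A}$ into
components which are multiples of a unit vector we may also write

$$\mathbf{A} = a^i\mathbf{e}_i \tag{5-77}$$

and the three scalars $a^i$ are the contravariant components of $\mathbf{A}$. Let us
define another set of base vectors

$$\mathbf{e}^1 = \frac{\mathbf{e}_2\times\mathbf{e}_3}{v}; \qquad \mathbf{e}^2 = \frac{\mathbf{e}_3\times\mathbf{e}_1}{v}; \qquad \mathbf{e}^3 = \frac{\mathbf{e}_1\times\mathbf{e}_2}{v} \tag{5-78}$$

where $v$ is the scalar triple product $[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]$ of sec. 4.6b. These vectors are
perpendicular to the planes of $\mathbf{e}_2$, $\mathbf{e}_3$; $\mathbf{e}_3$, $\mathbf{e}_1$ and $\mathbf{e}_1$, $\mathbf{e}_2$, respectively, and it
is easily seen that

$$\mathbf{e}^m\cdot\mathbf{e}_n = \delta^m_n \tag{5-79}$$

Furthermore, it is true that

$$\mathbf{e}_1 = \frac{\mathbf{e}^2\times\mathbf{e}^3}{v'}; \qquad \mathbf{e}_2 = \frac{\mathbf{e}^3\times\mathbf{e}^1}{v'}; \qquad \mathbf{e}_3 = \frac{\mathbf{e}^1\times\mathbf{e}^2}{v'} \tag{5-80}$$

where $v' = [\mathbf{e}^1\mathbf{e}^2\mathbf{e}^3]$ and $vv' = 1$;
hence the two sets of vectors $\mathbf{e}^m$ and $\mathbf{e}_n$ are said to be reciprocal to each
other. In terms of the reciprocal set¹⁵ (76) becomes

$$d\mathbf{r} = \mathbf{e}^i\,dq_i \tag{5-81}$$

and (77) becomes

$$\mathbf{A} = a_i\mathbf{e}^i \tag{5-82}$$

where the $a_i$ are the covariant components of $\mathbf{A}$.

¹⁴ The systems of base vectors introduced here are treated by matrix methods in
sec. 10.10.

¹⁵ Many interesting properties of reciprocal systems are presented by Gibbs-Wilson,
loc. cit., pp. 81-92.

If we equate (76) and (81) we obtain

$$d\mathbf{r} = \mathbf{e}_i\,dq^i = \mathbf{e}^j\,dq_j \tag{5-83}$$

and if we multiply by $\mathbf{e}^i$ or $\mathbf{e}_j$ we find, because of (79), that

$$dq^i = \mathbf{e}^i\cdot\mathbf{e}^j\,dq_j; \qquad dq_j = \mathbf{e}_j\cdot\mathbf{e}_i\,dq^i \tag{5-84}$$

Since the square of the distance between two points is given by $ds^2 = d\mathbf{r}\cdot d\mathbf{r}$
we see from (83) that

$$ds^2 = \mathbf{e}_i\cdot\mathbf{e}_j\,dq^i dq^j = \mathbf{e}^i\cdot\mathbf{e}^j\,dq_i dq_j \tag{5-85}$$

We may therefore identify the scalar products of the base vectors with the
tensors $g_{ij}$ and $g^{ij}$

$$g_{ij} = \mathbf{e}_i\cdot\mathbf{e}_j; \qquad g^{ij} = \mathbf{e}^i\cdot\mathbf{e}^j \tag{5-86}$$

For later use, we also note that we may equate (77) to (82)

$$\mathbf{A} = a_i\mathbf{e}^i = a^j\mathbf{e}_j \tag{5-87}$$

and use (79) and (86) to write

$$a_i = g_{ij}a^j; \qquad a^i = g^{ij}a_j \tag{5-88}$$

We also have from (87) the equivalent expressions

$$a_i = \mathbf{A}\cdot\mathbf{e}_i; \qquad a^i = \mathbf{A}\cdot\mathbf{e}^i$$

hence (87) may be stated in the alternative form

$$\mathbf{A} = (\mathbf{A}\cdot\mathbf{e}_i)\mathbf{e}^i = (\mathbf{A}\cdot\mathbf{e}^j)\mathbf{e}_j \tag{5-89}$$

We now have several relations by means of which we may find either the
contravariant or covariant components of an arbitrary vector $\mathbf{A}$. If we
wish to know the components in terms of unit vectors, tangent to the $q^i$
coordinate lines, we recall the equation defining the length of a vector
(sec. 4.4) and see that the appropriate unit vectors are

$$\mathbf{u}_i = \frac{\mathbf{e}_i}{\sqrt{\mathbf{e}_i\cdot\mathbf{e}_i}} = \frac{\mathbf{e}_i}{\sqrt{g_{ii}}} \quad (i \text{ not summed})$$

Therefore, any vector $\mathbf{A}$ may also be written as

$$\mathbf{A} = A_i\mathbf{u}_i$$

where

$$A_i = \sqrt{g_{ii}}\,a^i \quad (i \text{ not summed})$$

If needed, similar equations could be given in the reciprocal system.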

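Let us now return to the problem of the surface element in curvilinear
coordinates. Since $d\mathbf{s}_i = \mathbf{e}_i\,dq^i$ ($i$ not summed), we have

$$d\mathbf{S}_1 = d\mathbf{s}_2\times d\mathbf{s}_3 = (\mathbf{e}_2\times\mathbf{e}_3)\,dq^2dq^3$$

and

$$dS_1 = \left[(\mathbf{e}_2\times\mathbf{e}_3)\cdot(\mathbf{e}_2\times\mathbf{e}_3)\right]^{1/2}dq^2dq^3$$

It is easy to show (see Problem a, sec. 4.6) that the scalar product inside
the brackets becomes

$$(\mathbf{e}_2\cdot\mathbf{e}_2)(\mathbf{e}_3\cdot\mathbf{e}_3) - (\mathbf{e}_2\cdot\mathbf{e}_3)(\mathbf{e}_3\cdot\mathbf{e}_2)$$

Thus when we use (86) we obtain

$$dS_1 = \sqrt{g_{22}g_{33} - g_{23}^2}\;dq^2dq^3 \tag{5-6a}$$

Similarly, surface elements on the planes $q^2$ = const. and $q^3$ = const. are

$$dS_2 = \sqrt{g_{11}g_{33} - g_{13}^2}\;dq^1dq^3$$

$$dS_3 = \sqrt{g_{11}g_{22} - g_{12}^2}\;dq^1dq^2$$

The volume element is

$$d\tau = d\mathbf{s}_1\cdot d\mathbf{s}_2\times d\mathbf{s}_3 = [\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]\,dq^1dq^2dq^3$$

If we place $\mathbf{A} = \mathbf{e}_2\times\mathbf{e}_3$ and use (89) we get

$$\mathbf{A} = \mathbf{e}_2\times\mathbf{e}_3 = [\mathbf{e}^1\mathbf{e}_2\mathbf{e}_3]\mathbf{e}_1 + [\mathbf{e}^2\mathbf{e}_2\mathbf{e}_3]\mathbf{e}_2 + [\mathbf{e}^3\mathbf{e}_2\mathbf{e}_3]\mathbf{e}_3$$

Now by means of (4-18) and (78) we eliminate $\mathbf{e}^i$ to obtain

$$[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3] = \mathbf{e}_1\cdot\mathbf{A} = \frac{1}{[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]}\Big\{(\mathbf{e}_2\times\mathbf{e}_3\cdot\mathbf{e}_2\times\mathbf{e}_3)\,\mathbf{e}_1\cdot\mathbf{e}_1 + (\mathbf{e}_3\times\mathbf{e}_1\cdot\mathbf{e}_2\times\mathbf{e}_3)\,\mathbf{e}_1\cdot\mathbf{e}_2 + (\mathbf{e}_1\times\mathbf{e}_2\cdot\mathbf{e}_2\times\mathbf{e}_3)\,\mathbf{e}_1\cdot\mathbf{e}_3\Big\}$$

Finally we expand the scalar products within the brackets, using again the
result of Problem a, sec. 4.6, and get

$$[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]^2 = \mathbf{e}_1\cdot\mathbf{e}_1\left[(\mathbf{e}_2\cdot\mathbf{e}_2)(\mathbf{e}_3\cdot\mathbf{e}_3) - (\mathbf{e}_2\cdot\mathbf{e}_3)(\mathbf{e}_3\cdot\mathbf{e}_2)\right]$$

$$\qquad + \mathbf{e}_1\cdot\mathbf{e}_2\left[(\mathbf{e}_2\cdot\mathbf{e}_3)(\mathbf{e}_3\cdot\mathbf{e}_1) - (\mathbf{e}_2\cdot\mathbf{e}_1)(\mathbf{e}_3\cdot\mathbf{e}_3)\right]$$

$$\qquad + \mathbf{e}_1\cdot\mathbf{e}_3\left[(\mathbf{e}_2\cdot\mathbf{e}_1)(\mathbf{e}_3\cdot\mathbf{e}_2) - (\mathbf{e}_2\cdot\mathbf{e}_2)(\mathbf{e}_3\cdot\mathbf{e}_1)\right]$$

By means of (86) we may replace the scalar products in this equation by
the $g_{ij}$, finding that $[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3] = \sqrt{g}$, where $g$ is the determinant of the
components of $g_{ij}$, and the volume element becomes

$$d\tau = \sqrt{g}\;dq^1dq^2dq^3 \tag{5-7a}$$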
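The relations of this section can be tried out on a concrete oblique basis. The Python sketch below takes three non-orthogonal base vectors (the numerical values are purely illustrative), builds the reciprocal set according to (78), and checks (79), the relation $vv' = 1$ of (80), the lowering of indices in (88), the surface element factor of (5-6a) and the result $[\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3]^2 = g$. All function names are assumptions introduced for the example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def triple(u, v, w):
    """Scalar triple product [u v w] = u . (v x w)."""
    return dot(u, cross(v, w))


def reciprocal_basis(e1, e2, e3):
    """Reciprocal base vectors e^1, e^2, e^3 of eq. (5-78)."""
    v = triple(e1, e2, e3)
    return tuple(tuple(c / v for c in cross(a, b))
                 for a, b in ((e2, e3), (e3, e1), (e1, e2)))


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Illustrative oblique basis (not taken from the text).
e_lower = ((1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (0.5, 1.0, 3.0))
e_upper = reciprocal_basis(*e_lower)

v = triple(*e_lower)
v_prime = triple(*e_upper)

# Metric tensor g_ij = e_i . e_j, eq. (5-86).
g = [[dot(e_lower[i], e_lower[j]) for j in range(3)] for i in range(3)]

# e^m . e_n should be the Kronecker delta, eq. (5-79).
delta = [[dot(e_upper[m], e_lower[n]) for n in range(3)] for m in range(3)]

# Contravariant and covariant components of an arbitrary vector A.
A = (2.0, -1.0, 4.0)
a_upper = [dot(A, e_upper[i]) for i in range(3)]
a_lower = [dot(A, e_lower[i]) for i in range(3)]
a_lower_from_metric = [sum(g[i][j] * a_upper[j] for j in range(3)) for i in range(3)]

# Surface element factor of eq. (5-6a) compared with |e_2 x e_3|.
surface_factor = math.sqrt(g[1][1] * g[2][2] - g[1][2] ** 2)
cross_magnitude = math.sqrt(dot(cross(e_lower[1], e_lower[2]), cross(e_lower[1], e_lower[2])))

print(v, v_prime, det3(g), surface_factor, cross_magnitude)
```

For this particular basis the triple product is 6 and the determinant of the $g_{ij}$ is 36, so that $\sqrt{g}$ reproduces the volume factor of (5-7a).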


5.17. The Differential Operators in Tensor Notation. We have seen
in sec. 4.20 that the components of the gradient of a scalar point function $\varphi$
are $\partial\varphi/\partial q^i$. The direction of $dq^i$ is determined by the vector
$d\mathbf{s}_i = \mathbf{e}_i\,dq^i$, so that, in terms of the reciprocal base vectors $\mathbf{e}^i$ of (78),
we have in curvilinear coordinates

$$\nabla\varphi = \mathbf{e}^i\,\frac{\partial\varphi}{\partial q^i} \tag{5-9a}$$


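The divergence of a vector $\mathbf{V}$ in terms of its contravariant components is
the covariant derivative. Thus, from (4-71)

$$\nabla\cdot\mathbf{V} = V^i_{\;,i} = \frac{\partial V^i}{\partial q^i} + \{ij,\,i\}\,V^j \tag{5-90}$$

Now according to (4-63)

$$\{ij,\,i\} = \frac{1}{2}\,g^{ik}\left(\frac{\partial g_{ik}}{\partial q^j} + \frac{\partial g_{jk}}{\partial q^i} - \frac{\partial g_{ij}}{\partial q^k}\right)$$

but

$$g^{ik}\,\frac{\partial g_{jk}}{\partial q^i} = g^{ki}\,\frac{\partial g_{ji}}{\partial q^k}$$

since we may exchange the dummy indices $i$ and $k$. Moreover, $g^{ik} = g^{ki}$
and $g_{ij} = g_{ji}$ so we may cancel the second and third terms in
$\{ij,\,i\}$. Finally, we refer to (4-59) and the rule for differentiating
determinants (see sec. 10.4) to prove that

$$\frac{\partial g}{\partial g_{ij}} = G^{ij} = g\,g^{ij}$$

The Christoffel symbol therefore takes the form

$$\{ij,\,i\} = \frac{1}{2}\,g^{ik}\,\frac{\partial g_{ik}}{\partial q^j} = \frac{1}{2g}\frac{\partial g}{\partial q^j} = \frac{1}{\sqrt{g}}\frac{\partial(\sqrt{g})}{\partial q^j}$$

Eq. (90) may thus be written as

$$\nabla\cdot\mathbf{V} = \frac{\partial V^i}{\partial q^i} + \frac{V^j}{\sqrt{g}}\frac{\partial(\sqrt{g})}{\partial q^j} = \frac{1}{\sqrt{g}}\frac{\partial}{\partial q^i}\left[V^i\sqrt{g}\right] \tag{5-20a}$$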


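A similar expression may be obtained in terms of covariant components
of $\mathbf{V}$.

If $\mathbf{V} = \nabla\varphi$, the contravariant components of the gradient are
$V^i = \mathbf{V}\cdot\mathbf{e}^i$ by (88) and by (9a)

$$V^i = g^{ij}\,\frac{\partial\varphi}{\partial q^j}$$

Substituting this result in (20a) we find for the Laplacian

$$\nabla^2\varphi = \frac{1}{\sqrt{g}}\frac{\partial}{\partial q^i}\left(\sqrt{g}\,g^{ij}\,\frac{\partial\varphi}{\partial q^j}\right) \tag{5-21a}$$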


F,y = 




The final expression we wish to derive here is the curl. Define the 
covariant tensor of rank two 

dVj _ 
dq^ dq' 

If we transform to a new coordinate system we see from eq. (4-53) that 

This tensor, which is invariant to such a transformation, is the curl in
curvilinear coordinates. According to its definition, it is skew symmetric,
hence the only non-vanishing components are $V_{12}$, $V_{23}$ and $V_{31}$. In terms
of the base vectors we write

$$\nabla\times\mathbf{V} = V_{12}(\mathbf{e}^1\times\mathbf{e}^2) + V_{23}(\mathbf{e}^2\times\mathbf{e}^3) + V_{31}(\mathbf{e}^3\times\mathbf{e}^1)$$

We have shown in (78) how to convert the $\mathbf{e}^i$ into the reciprocal base
vectors and we have also proved that $v = [\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3] = \sqrt{g}$. With these
changes, the curl of a vector $\mathbf{V}$ is*

_,XV + 

■s/glX^q dq] agV 


+ 


(5-22a) 


62 

,11/^ uii ) ag j 

__ ^i| 

dq^] 

It is a simple matter to see what happens when the coordinate surfaces
are mutually orthogonal. In that case, the vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ are also
perpendicular to each other and $\mathbf{e}^i$ is parallel to $\mathbf{e}_i$. Moreover,

$$\mathbf{e}^i = \frac{\mathbf{e}_i}{g_{ii}} \quad (i \text{ not summed})$$

and $\mathbf{e}_1\cdot\mathbf{e}_2 = \mathbf{e}_2\cdot\mathbf{e}_3 = \mathbf{e}_3\cdot\mathbf{e}_1 = 0$. It thus follows that $g_{ij} = 0$ unless
$i = j$; in the latter case,

$$g^{ii} = 1/g_{ii}$$

Remembering that $g_{ii}$ is then identical with $Q_i^2$ as used in the first parts
of this chapter, equations such as (3a), (4a), etc., in secs. 5.16 and
5.17 will reduce to the corresponding equations which appeared earlier
in this chapter without the letter a.

Problem. Derive by the tensor method the results of Problems a, b, c, of sec. 5.15.

* Note that this tensor differs in sign from the conventional curl of vector analysis.



CHAPTER 6 

CALCULUS OF VARIATIONS 


One of the elementary problems of the differential calculus is to find
the maxima and minima, that is, the stationary values, of a function $y(x)$.
The necessary condition for the occurrence of a stationary value at $x = a$
is that $y'(a) = 0$. Sufficient conditions that it shall be a minimum or a
maximum are, respectively, $y''(a) > 0$ and $y''(a) < 0$. The calculus of
variations deals with a similar, but more complicated problem, that of
finding a function $y(x)$ such that a definite integral, taken over a function
of this function, shall be a maximum or a minimum. The simpler parts of
this calculus, to which this chapter will be primarily devoted, deal with
the necessary conditions that the integral shall be either a maximum or a
minimum; in other words, that it shall have a stationary value; sufficiency
considerations as well as criteria for establishing the maximum or minimum
character of the solutions are not important in many physical applications.
For these, the reader should consult more comprehensive treatises on the
subject.¹

6.1. Single Independent and Single Dependent Variable. Let it be
desired, then, to find that function $y(x)$ which will cause the integral

$$\int_{x_1}^{x_2} I(x, y, y_x)\,dx$$

to have a stationary value. The integrand $I$ is taken to be a function of
the dependent variable $y$ as well as the independent variable $x$ and
$y_x = dy/dx$. The limits $x_1$ and $x_2$ are fixed and at each of them, $y$ has a
fixed value. The integral over $I$ takes on different values along different
paths connecting the points $(x_1,y_1)$ and $(x_2,y_2)$; one of these paths is
labeled $Y(x)$ in Fig. 1. We assume that it is either largest or smallest
along $y(x)$, for example. The paths $Y(x)$ which are admitted for comparison
shall be adjacent paths covering a small neighborhood of the
stationary path $y(x)$, that is, $Y(x) - y(x)$ shall be infinitesimal for all values
of $x$ between $x_1$ and $x_2$.

We define:

$$\delta y(x) = Y(x) - y(x) \tag{6-1}$$

$$\delta I = I(x, Y, Y_x) - I(x, y, y_x) \tag{6-2}$$

¹ See, for instance, Bolza, O., "Vorlesungen über Variationsrechnung"; Bliss,
G. A., "Calculus of Variations," Chicago, 1925.

The symbol $\delta$ is called variation; it represents the increase in the
quantity to which it is applied as we pass from the stationary path to the
comparison path at the same value of $x$. Thus, clearly $\delta x = 0$. Furthermore,

$$\frac{d}{dx}\,\delta y = \frac{d}{dx}(Y - y) = \frac{dY}{dx} - \frac{dy}{dx} = \delta y_x \tag{6-3}$$

This shows that the symbols $\delta$ and $d/dx$ "commute." Since $Y$ and $y$ are
adjacent, it follows from (2) that

$$\delta I = I(x,\, y + \delta y,\, y_x + \delta y_x) - I(x, y, y_x) = \frac{\partial I}{\partial y}\,\delta y + \frac{\partial I}{\partial y_x}\,\delta y_x \tag{6-4}$$

In words, the formal rules for computing variations are the same as those
for computing differentials.

In terms of this notation, the condition that $\int_{x_1}^{x_2} I\,dx$ be stationary is
easily written down. It is simply that the integral along $y$ shall yield the
same value as that along $y + \delta y$,

$$\delta\int_{x_1}^{x_2} I\,dx = \int_{x_1}^{x_2}\delta I\,dx = 0 \tag{6-5}$$

This is of course the analogue of the condition in the ordinary calculus
that $y(x)$ be stationary, i.e., $dy = 0$. With the use of (3) and (4), eq. (5)
becomes

$$\int_{x_1}^{x_2}\left[\frac{\partial I}{\partial y}\,\delta y + \frac{\partial I}{\partial y_x}\frac{d}{dx}(\delta y)\right]dx = 0$$

The second term of the integrand yields after partial integration

$$-\int_{x_1}^{x_2}\frac{d}{dx}\left(\frac{\partial I}{\partial y_x}\right)\delta y\,dx + \left[\frac{\partial I}{\partial y_x}\,\delta y\right]_{x_1}^{x_2}$$

But the integrated part vanishes at both limits because $\delta y_1 = \delta y_2 = 0$.
Hence the stationarity condition becomes

$$\int_{x_1}^{x_2}\left(\frac{\partial I}{\partial y} - \frac{d}{dx}\frac{\partial I}{\partial y_x}\right)\delta y\,dx = 0 \tag{6-6}$$

While the vanishing of an integral does not in general imply that the
integrand is zero, we may nevertheless conclude here that

$$\frac{\partial I}{\partial y} - \frac{d}{dx}\frac{\partial I}{\partial y_x} = 0 \tag{6-7}$$


This is because the parenthesis in (6) is multiplied by an arbitrary though
infinitesimally small function of $x$, namely $\delta y$. For if the left-hand side of
(7) were not zero for every $x$, it would have to be positive in some range and
negative in another range in order to satisfy (6) with a positive $\delta y$. We
may then choose $\delta y$ to be positive where the left side of (7) is positive and
negative elsewhere, an arrangement which would violate (6). Hence
eq. (7) follows and is the condition we are seeking. A function $y$ which
satisfies that differential equation is called an extremal. Among these
extremals the minimizing or maximizing curve $y$ will be found, provided
it exists.
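The meaning of "stationary" can be illustrated numerically. The Python sketch below takes the arc-length integrand $I = (1 + y_x^2)^{1/2}$, for which the straight line $y = x$ satisfies (7), and evaluates the integral along comparison paths $Y = y + \delta y$ with $\delta y = \epsilon\sin\pi x$, which vanishes at both limits. If the first variation is zero, the excess of the integral over its stationary value should shrink like $\epsilon^2$ rather than $\epsilon$. The choice of integrand, interval, perturbation and quadrature rule are all assumptions made for the illustration.

```python
import math


def arc_length(eps, n=2000):
    """Midpoint-rule value of the integral of (1 + Y_x^2)^(1/2) over 0 <= x <= 1
    along the comparison path Y(x) = x + eps * sin(pi * x)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        slope = 1.0 + eps * math.pi * math.cos(math.pi * x)  # Y_x
        total += math.sqrt(1.0 + slope * slope) * h
    return total


stationary_value = arc_length(0.0)  # length of the straight line, sqrt(2)

# Excess length of the comparison paths over the stationary path.
excess = {eps: arc_length(eps) - stationary_value for eps in (1e-1, 1e-2, 1e-3)}

# Reducing eps tenfold should reduce the excess roughly a hundredfold.
ratio = excess[1e-2] / excess[1e-3]
print(stationary_value, excess, ratio)
```

A ratio close to 100 indicates that the term linear in $\epsilon$ is absent, which is the content of (5); the positive sign of the excess shows that this particular extremal is a minimum.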

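Eq. (7) was first derived by Euler; it is called the Euler equation
associated with the variation problem. It may be written in a different
form:

$$\frac{d}{dx}\left(I - y_x\frac{\partial I}{\partial y_x}\right) - \frac{\partial I}{\partial x} = 0 \tag{6-7a}$$

which is useful when $I$ does not depend explicitly on $x$, for then (7a) shows
that

$$I - y_x\frac{\partial I}{\partial y_x} = \text{const.}$$

represents an extremal. The identity of (7) and (7a) is at once established
by noting:

$$\frac{dI}{dx} = \frac{\partial I}{\partial x} + \frac{\partial I}{\partial y}\,y_x + \frac{\partial I}{\partial y_x}\frac{dy_x}{dx}$$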
Examples.

a. Geodesics. It is usually taken for granted that a straight line is the
shortest distance between two points in a plane. The calculus of variations
provides a formal proof of this assertion. The element of distance in
Cartesian coordinates is given by $ds^2 = dx^2 + dy^2$. Hence

$$s = \int_{x_1}^{x_2} ds = \int_{x_1}^{x_2}\left(1 + y_x^2\right)^{1/2}dx$$

If this is to be a minimum, Euler's equation (7), with $I = (1 + y_x^2)^{1/2}$,
must be satisfied. Hence

$$\frac{d}{dx}\left[\frac{y_x}{(1 + y_x^2)^{1/2}}\right] = 0$$

or

$$\frac{y_x}{\sqrt{1 + y_x^2}} = \text{const.}$$

which means $dy/dx$ = const.

The minimizing curve is the straight line passing through the points
$(x_1,y_1)$ and $(x_2,y_2)$. Had we chosen polar coordinates, the problem would have
been to find $r$ as a function of $\varphi$ such that

$$s = \int\left(dr^2 + r^2d\varphi^2\right)^{1/2} = \int\left(r^2 + r_\varphi^2\right)^{1/2}d\varphi$$

is stationary. The Euler equation then reads

$$\frac{r}{(r^2 + r_\varphi^2)^{1/2}} - \frac{d}{d\varphi}\left[\frac{r_\varphi}{(r^2 + r_\varphi^2)^{1/2}}\right] = 0$$

This reduces to

$$\frac{r^2 + 2r_\varphi^2 - r\,r_{\varphi\varphi}}{(r^2 + r_\varphi^2)^{3/2}} = 0$$

The expression on the left is simply the curvature of the curve in polar
coordinates; hence the result is the same as before.

The element of distance on the surface of a sphere of radius $a$ is given by

$$ds = a\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right)^{1/2}$$

If we wish to find $\varphi$ as a function of $\theta$ such that $s$ is stationary, we must solve
(7) with $I = (1 + \sin^2\theta\,\varphi_\theta^2)^{1/2}$:

$$\frac{d}{d\theta}\left[\frac{\sin^2\theta\,\varphi_\theta}{(1 + \sin^2\theta\,\varphi_\theta^2)^{1/2}}\right] = 0$$

When the bracket is put equal to a constant, $c$, we get

$$\varphi_\theta = \frac{c\,\mathrm{cosec}^2\,\theta}{(1 - c^2 - c^2\cot^2\theta)^{1/2}}$$

and on integrating

$$\varphi = \alpha - \sin^{-1}(k\cot\theta)$$

$\alpha$ and $k$ being new constants. To interpret this result we write it in
Cartesian coordinates, using $z = a\cos\theta$. We have $ak\cot\theta = a\sin(\alpha - \varphi)$,
or, on multiplying by $\sin\theta$,

$$kz = x\sin\alpha - y\cos\alpha$$

This represents a plane passing through the origin and hence cutting the
surface of the sphere in a great circle. The shortest (and also the longest)
distance between two points on the surface of the sphere is the arc of the
great circle connecting them!
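The claim that the extremal lies in a plane through the origin can be verified directly. The Python sketch below evaluates $\varphi = \alpha - \sin^{-1}(k\cot\theta)$ for a range of $\theta$, converts each point to Cartesian coordinates on a sphere of radius $a$, and computes the residual of the plane equation $kz = x\sin\alpha - y\cos\alpha$. The values of $a$, $\alpha$ and $k$ are arbitrary illustrative choices, and the range of $\theta$ is restricted so that $|k\cot\theta| \le 1$.

```python
import math


def geodesic_phi(theta, alpha, k):
    """Extremal of the sphere problem: phi = alpha - arcsin(k cot(theta))."""
    return alpha - math.asin(k / math.tan(theta))


def sphere_to_cartesian(a, theta, phi):
    """Cartesian coordinates of the point (theta, phi) on a sphere of radius a."""
    return (a * math.sin(theta) * math.cos(phi),
            a * math.sin(theta) * math.sin(phi),
            a * math.cos(theta))


# Illustrative constants (assumptions for this sketch).
RADIUS, ALPHA, K = 2.0, 0.6, 0.5

# theta from 0.6 to 2.5 keeps |K cot(theta)| below 1.
thetas = [0.6 + 0.1 * i for i in range(20)]

plane_residuals = []
for theta in thetas:
    x, y, z = sphere_to_cartesian(RADIUS, theta, geodesic_phi(theta, ALPHA, K))
    plane_residuals.append(K * z - (x * math.sin(ALPHA) - y * math.cos(ALPHA)))

max_residual = max(abs(r) for r in plane_residuals)
print(max_residual)
```

The residual should be zero to rounding error for every sampled point, so that all of them lie on one great circle.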

b. The Brachistochrone.

A problem which held the fascination of mathematicians for several
decades of the 17th and 18th centuries is that of finding the path on which
an object, in the absence of friction, will slide from one given point to
another in the shortest (brachistos) time (chronos). John Bernoulli
proposed the problem in 1696; both he and his brother James, and also
Newton and Leibnitz, found the correct solution. The path, which
happens to be a cycloid, is known as the brachistochrone.

Let the particle start from rest at the origin; the terminal point of the
motion is $(x_2,y_2)$. In working this problem it is convenient to extend the
Y-axis to the right and to measure $x$ downward. Then from the principle
of conservation of energy,

$$\tfrac{1}{2}mv^2 = mgx$$

where $v$ is the velocity of the particle at any point of its path, $m$ its mass and
$g$ the acceleration of gravity. Hence, since $v = ds/dt$,

$$dt = \frac{ds}{v} = \frac{(dx^2 + dy^2)^{1/2}}{(2gx)^{1/2}} = \frac{(1 + y_x^2)^{1/2}}{(2gx)^{1/2}}\,dx$$

The integral to be minimized is therefore

$$t = \int_0^{x_2}\left(\frac{1 + y_x^2}{2gx}\right)^{1/2}dx$$

Euler's equation reads

$$\frac{d}{dx}\left[\frac{y_x}{\left\{x(1 + y_x^2)\right\}^{1/2}}\right] = 0$$

Hence

$$\frac{y_x}{\left\{x(1 + y_x^2)\right\}^{1/2}} = c, \qquad y_x = \left(\frac{c^2x}{1 - c^2x}\right)^{1/2}$$

If we introduce the constant $2a = 1/c^2$, integration leads to

$$y = a\,\mathrm{vers}^{-1}\frac{x}{a} - (2ax - x^2)^{1/2} + c' \tag{6-8}$$

But the new constant of integration, $c'$, must be zero in order to make $y$
vanish at $x = 0$. Eq. (8) represents the equation of an inverted cycloid
with its base along Y and its cusp at the origin. (Cf. Fig. 2.) The constant
$a$ must be so adjusted that the cycloid passes through the point
$(x_2,y_2)$. The path will also be a cycloid if we allow the particle to fall with
a finite initial velocity, as the reader may verify.
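To see the minimum property at work, one may compare the time of descent along the cycloid with that along the straight line joining the same end points. The Python sketch below uses the parametric form $x = a(1 - \cos\theta)$, $y = a(\theta - \sin\theta)$ of (8), with $x$ measured downward, for which $dt = (a/g)^{1/2}\,d\theta$; it also evaluates $\int ds/v$ numerically as a cross-check. The value of $g$, the constant $a$, the end point and the function names are assumptions made for the illustration.

```python
import math

G = 9.81  # acceleration of gravity in m/s^2 (assumed value)


def cycloid_point(a, theta):
    """Point of the cycloid (6-8): x measured downward, y horizontally."""
    return a * (1.0 - math.cos(theta)), a * (theta - math.sin(theta))


def time_on_cycloid(a, theta_end, g=G):
    """Closed form: along the cycloid dt = sqrt(a / g) d(theta)."""
    return theta_end * math.sqrt(a / g)


def time_on_cycloid_numeric(a, theta_end, g=G, n=20000):
    """Midpoint-rule value of the integral of ds / v with v = sqrt(2 g x)."""
    h = theta_end / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        ds_dtheta = a * math.hypot(math.sin(theta), 1.0 - math.cos(theta))
        speed = math.sqrt(2.0 * g * a * (1.0 - math.cos(theta)))
        total += ds_dtheta / speed * h
    return total


def time_on_straight_line(x2, y2, g=G):
    """Frictionless slide from rest down the chord from the origin to (x2, y2)."""
    length = math.hypot(x2, y2)
    return length * math.sqrt(2.0 / (g * x2))


a_const, theta_end = 1.0, 2.0  # illustrative values
x2, y2 = cycloid_point(a_const, theta_end)

t_cycloid = time_on_cycloid(a_const, theta_end)
t_numeric = time_on_cycloid_numeric(a_const, theta_end)
t_line = time_on_straight_line(x2, y2)
print(t_cycloid, t_numeric, t_line)
```

The two cycloid times should agree closely, and both should be smaller than the time along the straight chord.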


Fig. 6-2



c. Minimum Surface of Revolution.

The soap film problem discussed in sec. 2.2i may also be solved by the
method outlined above. Whatever the function $y(x)$, the surface generated
by revolving $y$ about the X-axis has an area

$$2\pi\int y\,ds = 2\pi\int y\left(1 + y_x^2\right)^{1/2}dx$$

If this is to be a minimum, eq. (7a) requires that

$$y\left(1 + y_x^2\right)^{1/2} - y\,y_x^2\left(1 + y_x^2\right)^{-1/2} = \frac{y}{(1 + y_x^2)^{1/2}} = \text{const.}$$

an expression which is identical with our former solution of the soap film
problem.

Problem. Solve Example c with the use of equation (7).

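6.2. Several Dependent Variables. The foregoing simple considerations
may be generalized in several obvious ways. In the present section
we shall suppose that the integrand $I$ occurring in the integral to be
minimized or maximized is a function of one independent but several
dependent variables. In almost all examples relevant to this situation the
independent variable is the time, while the coordinates are dependent.
In view of this fact we shall modify our notation, using $t$ in place of the
former $x$, and $x$, $y$, $z$, etc., in place of the former $y$. We wish to find the
functions $x(t)$, $y(t)$, $z(t)$, ... which make the integral

$$\int_{t_1}^{t_2} I(t, x, y, z, \ldots, x_t, y_t, z_t, \ldots)\,dt$$

stationary. The Euler condition is derived as before; we must require that

$$\int_{t_1}^{t_2}\delta I\,dt = 0 \tag{6-9}$$

But in this case

$$\delta I = \frac{\partial I}{\partial x}\,\delta x + \frac{\partial I}{\partial y}\,\delta y + \frac{\partial I}{\partial z}\,\delta z + \cdots + \frac{\partial I}{\partial x_t}\,\delta x_t + \frac{\partial I}{\partial y_t}\,\delta y_t + \frac{\partial I}{\partial z_t}\,\delta z_t + \cdots$$

In computing the integral (9) we again perform partial integrations in the
second group of terms; for example:

$$\int_{t_1}^{t_2}\frac{\partial I}{\partial x_t}\,\delta x_t\,dt = \int_{t_1}^{t_2}\frac{\partial I}{\partial x_t}\frac{d}{dt}(\delta x)\,dt = \left[\frac{\partial I}{\partial x_t}\,\delta x\right]_{t_1}^{t_2} - \int_{t_1}^{t_2}\frac{d}{dt}\left(\frac{\partial I}{\partial x_t}\right)\delta x\,dt$$

As before, $\delta x$ vanishes at both limits. Hence (9) becomes

$$\int_{t_1}^{t_2}\left[\left(\frac{\partial I}{\partial x} - \frac{d}{dt}\frac{\partial I}{\partial x_t}\right)\delta x + \left(\frac{\partial I}{\partial y} - \frac{d}{dt}\frac{\partial I}{\partial y_t}\right)\delta y + \left(\frac{\partial I}{\partial z} - \frac{d}{dt}\frac{\partial I}{\partial z_t}\right)\delta z + \cdots\right]dt = 0$$

If $\delta x$, $\delta y$, $\delta z$ are entirely arbitrary and independent functions of $t$, each of
the parentheses occurring here must vanish separately. Hence we obtain,
in place of the one Euler equation (7), as many as there are dependent
variables:

$$\frac{\partial I}{\partial x} - \frac{d}{dt}\frac{\partial I}{\partial x_t} = 0; \qquad \frac{\partial I}{\partial y} - \frac{d}{dt}\frac{\partial I}{\partial y_t} = 0; \qquad \frac{\partial I}{\partial z} - \frac{d}{dt}\frac{\partial I}{\partial z_t} = 0; \;\ldots \tag{6-10}$$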


6.3. Example: Hamilton's Principle. The elementary formulation of
the laws of mechanics is Newton's; it involves in an essential manner the
concept of force. Numerous other formulations based on different fundamental
ideas, particularly the energy concept, have been proposed throughout
the history of the subject. The most important of these is Hamilton's
principle. It should be regarded not as a consequence of Newton's laws
of force (although it can be shown to be consistent with them) but as a
parallel fundamental postulate of mechanics which may be useful in cases
where Newton's laws are cumbersome in their application. The principle
takes for granted a knowledge of the kinetic energy, $T$, of the mechanical
system as a function of the coordinates and their derivatives, and also of
the potential energy, $V$, as a function of coordinates and possibly the time.
From the functional form of $T$ and $V$ it then permits the deduction of the
coordinates as functions of the time.

The principle postulates that the integral

$$\int_{t_1}^{t_2}(T - V)\,dt$$

shall have a stationary value. The integrand, $T - V$, is called the Lagrangian
function. We shall consider only conservative mechanical systems,
that is, systems for which $V$ is a function of the coordinates only.

Let us first treat the motion of a simple mass point in three dimensions,
using rectangular coordinates for its description. Then

$$T = \tfrac{1}{2}m\left(x_t^2 + y_t^2 + z_t^2\right)$$

and

$$V = V(x,y,z)$$

so that

$$I = \tfrac{1}{2}m\left(x_t^2 + y_t^2 + z_t^2\right) - V$$

Eqs. (10) are then seen to be Newton's laws of motion:

$$\frac{d}{dt}(mx_t) = -\frac{\partial V}{\partial x}; \qquad \frac{d}{dt}(my_t) = -\frac{\partial V}{\partial y}; \qquad \frac{d}{dt}(mz_t) = -\frac{\partial V}{\partial z}$$

An advantage of Hamilton's principle becomes apparent when the
problem is such that another system of coordinates is more natural for its
solution. In that case Newton's laws require the transformation of the
force components to the new coordinates, which is sometimes inconvenient,
while the scalars $T$ and $V$ are more easily transformed. Thus consider the
motion of a particle in a central field of force, that is, $V = V(r)$. Using
polar coordinates we have

$$T = \frac{m}{2}\left(r_t^2 + r^2\varphi_t^2\right), \qquad V = V(r)$$

The independent coordinates are $r$ and $\varphi$. Hence Euler's equations read

$$\frac{d}{dt}(mr_t) - mr\varphi_t^2 = -\frac{\partial V}{\partial r}$$

$$\frac{d}{dt}\left(mr^2\varphi_t\right) = 0$$

The first of these is the well known radial equation of the problem of
planetary motion ($-\partial V/\partial r = \text{const.}/r^2$); the term $mr\varphi_t^2$ represents the
centripetal force, which appears automatically in this theory. The
second equation is Kepler's second law for it states that $r^2(d\varphi/dt)$ = const.
Its meaning is obvious when it is remembered that the area swept out by
the radius vector in unit time is $\tfrac{1}{2}r^2(d\varphi/dt)$.
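The two Euler equations for the central field can be integrated numerically to watch Kepler's second law emerge. The Python sketch below assumes an attractive potential $V = -k/r$ with $k = m = 1$ in arbitrary units, writes the equations as $\ddot r = r\dot\varphi^2 - k/(mr^2)$ and $\ddot\varphi = -2\dot r\dot\varphi/r$, advances them with a classical fourth-order Runge-Kutta step, and records the areal velocity $\tfrac{1}{2}r^2\dot\varphi$ along the orbit. The potential, the initial conditions, the step size and the integrator are all assumptions chosen for the illustration.

```python
def derivatives(state, k=1.0, m=1.0):
    """Right-hand sides of the polar equations of motion for V = -k / r."""
    r, r_dot, phi, phi_dot = state
    r_ddot = r * phi_dot**2 - k / (m * r**2)   # radial Euler equation
    phi_ddot = -2.0 * r_dot * phi_dot / r      # from d/dt (m r^2 phi_dot) = 0
    return (r_dot, r_ddot, phi_dot, phi_ddot)


def rk4_step(state, dt):
    """One classical Runge-Kutta step for the state (r, r_dot, phi, phi_dot)."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = derivatives(state)
    k2 = derivatives(shifted(state, k1, dt / 2.0))
    k3 = derivatives(shifted(state, k2, dt / 2.0))
    k4 = derivatives(shifted(state, k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


# Illustrative initial conditions: start at r = 1 with purely angular motion.
state = (1.0, 0.0, 0.0, 1.2)
dt = 1.0e-3

areal_rates = []
for step in range(5000):
    areal_rates.append(0.5 * state[0] ** 2 * state[3])
    state = rk4_step(state, dt)

spread = max(areal_rates) - min(areal_rates)
print(areal_rates[0], spread, state[0])
```

Although $r$ changes appreciably during the run, the areal velocity should stay constant to within the small integration error.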

Turning now to the consideration of more complicated physical systems
containing more than one mass point, we first introduce general coordinates,
$q_1, q_2, q_3, \ldots, q_n$, where $n$ is the number of degrees of freedom. $V$ will
be a function of the $q$'s, but it will not depend on the $\dot q$'s. The kinetic
energy, $T$, however, will be a function of the $q$'s as well as the $\dot q$'s (except
when Cartesian coordinates are used).

Hamilton's principle then states that

$$\delta\int_{t_1}^{t_2}\left[T(q_1, q_2, \ldots, q_n, \dot q_1, \dot q_2, \ldots, \dot q_n) - V(q_1, \ldots, q_n)\right]dt = 0$$

Eqs. (10) become²

$$\frac{\partial T}{\partial q_i} - \frac{d}{dt}\frac{\partial T}{\partial\dot q_i} = \frac{\partial V}{\partial q_i}, \qquad i = 1, 2, \ldots, n \tag{6-11}$$

These are the famous Lagrangian equations of motion, first derived by
Lagrange (without the use of the calculus of variations).

To illustrate their applicability and also the use of generalized coordinates
we discuss one further example taken from the field of electricity.
If $q$ is the charge and $i = \dot q$ the current in a simple circuit which has
capacitance $C$ and self-inductance $L$, its total energy at any instant may be
shown to be

$$\tfrac{1}{2}L\dot q^2 + \frac{q^2}{2C}$$

It is clear from the foregoing remarks that the first of these two terms
may be regarded as kinetic energy $T$, the second as potential energy $V$
provided $q$ is chosen as a generalized coordinate. The intuitive meaning
of $T$ and $V$ here becomes lost, as it does in many problems of advanced
dynamics. Lagrange's equation for the present case takes the form

$$L\ddot q + \frac{q}{C} = 0$$

and this will be recognized as the differential equation describing the
natural oscillations of an electrical circuit having no resistance. More
complicated examples of the application of Lagrange's equations to electrical
and indeed even thermal phenomena are available.³

² Here $\dot q_i$ has been written for $dq_i/dt$.
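The circuit example lends itself to a quick check. Taking $T = \tfrac{1}{2}L\dot q^2$ and $V = q^2/2C$, Lagrange's equation gives $L\ddot q + q/C = 0$, whose solutions oscillate with angular frequency $\omega = (LC)^{-1/2}$. The Python sketch below assumes the trial solution $q = q_0\cos\omega t$, evaluates the residual of the equation of motion by a central second difference, and confirms that the total energy stays constant. The numerical values of $L$, $C$ and $q_0$ are arbitrary, and the names are illustrative.

```python
import math

# Illustrative circuit constants in arbitrary consistent units (assumptions).
INDUCTANCE, CAPACITANCE, Q0 = 2.0, 0.5, 1.0
OMEGA = 1.0 / math.sqrt(INDUCTANCE * CAPACITANCE)  # natural angular frequency


def charge(t):
    """Trial solution q(t) = q0 cos(omega t)."""
    return Q0 * math.cos(OMEGA * t)


def current(t):
    """i = dq/dt for the trial solution."""
    return -Q0 * OMEGA * math.sin(OMEGA * t)


def energy(t):
    """Total energy T + V = L i^2 / 2 + q^2 / (2 C)."""
    return 0.5 * INDUCTANCE * current(t) ** 2 + charge(t) ** 2 / (2.0 * CAPACITANCE)


def lagrange_residual(t, h=1.0e-3):
    """L q'' + q / C, with q'' estimated by a central second difference."""
    q_ddot = (charge(t + h) - 2.0 * charge(t) + charge(t - h)) / h**2
    return INDUCTANCE * q_ddot + charge(t) / CAPACITANCE


times = [0.1 * n for n in range(1, 60)]
max_residual = max(abs(lagrange_residual(t)) for t in times)
energies = [energy(t) for t in times]
energy_spread = max(energies) - min(energies)
print(OMEGA, max_residual, energy_spread)
```

The residual is limited only by the finite-difference error, and the energy remains equal to its initial value $q_0^2/2C$.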


Problem. For a simple harmonic oscillator, $V = \tfrac{1}{2}kx^2$. Use Hamilton's principle
to obtain its equation of motion.


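6.4. Several Independent Variables. — Next, it is necessary to extend the simple theory so
as to permit the integrand to contain several independent variables. The problem then is
to find a function $u(x,y,z)$ such that

$$\iiint I(x,y,z,u,u_x,u_y,u_z)\,dx\,dy\,dz$$

is stationary. Here we are treating $x$, $y$, $z$ as independent variables, $u$ as the one
dependent variable, and we define again: $u_x = \partial u/\partial x$, etc. As before,
we require

$$\iiint \delta I\,dx\,dy\,dz = 0 \qquad (6\text{-}12)$$

Here $\delta u$ represents the increment incurred in the passage from the extremal $u$ to
some neighboring function $U$, $x$, $y$, and $z$ being held fixed. Hence
$\delta x = \delta y = \delta z = 0$. Therefore

$$\delta I = \frac{\partial I}{\partial u}\delta u + \frac{\partial I}{\partial u_x}\delta u_x + \frac{\partial I}{\partial u_y}\delta u_y + \frac{\partial I}{\partial u_z}\delta u_z$$

In evaluating an integral like $\iiint \frac{\partial I}{\partial u_x}\delta u_x\,dx\,dy\,dz$
we first perform the integration with respect to $x$, obtaining

$$\int \frac{\partial I}{\partial u_x}\delta u_x\,dx = 0 - \int \frac{\partial}{\partial x}\left(\frac{\partial I}{\partial u_x}\right)\delta u\,dx$$

and (12) reads

$$\iiint \left(\frac{\partial I}{\partial u} - \frac{\partial}{\partial x}\frac{\partial I}{\partial u_x} - \frac{\partial}{\partial y}\frac{\partial I}{\partial u_y} - \frac{\partial}{\partial z}\frac{\partial I}{\partial u_z}\right)\delta u\,dx\,dy\,dz = 0$$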

* See Thomson, J. J., "Applications of Dynamics to Physics and Chemistry," Macmillan Co.,
1888. A less extensive account may be found in Lindsay and Margenau, "Foundations of
Physics," John Wiley and Sons, 1936, pp. 188-212.

The corresponding Euler equation is therefore

$$\frac{\partial I}{\partial u} - \frac{\partial}{\partial x}\frac{\partial I}{\partial u_x} - \frac{\partial}{\partial y}\frac{\partial I}{\partial u_y} - \frac{\partial}{\partial z}\frac{\partial I}{\partial u_z} = 0 \qquad (6\text{-}13)$$

If, in addition to $u$, there are other dependent variables $v$, $w$, etc., eq. (13) is
augmented by other equations in which $u$ is replaced by $v$, $w$, etc.


Examples.

a. Let us find the function $u(x,y,z)$ which has a minimum average value of the square of
its gradient in a certain region of space. Although this requirement seems artificial at
first sight, it is nevertheless of considerable significance in electrostatic and
quantum-mechanical problems. If

$$\iiint (\nabla u)^2\,dx\,dy\,dz$$

is to be stationary, $I = u_x^2 + u_y^2 + u_z^2$ (cf. Chapter 4 for the definition of the
operator $\nabla$), and (13) becomes

$$u_{xx} + u_{yy} + u_{zz} \equiv \nabla^2 u = 0$$

This is Laplace's equation which must be satisfied, for instance, by the
electric potential in free space. (Cf. Chapter 7.)
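A discrete analogue makes the connection between Laplace's equation and the least mean square gradient concrete. The sketch below is an illustration on an assumed two-dimensional grid with unit spacing: it takes the harmonic function $u = x^2 - y^2$, shifts a single interior value up or down while leaving all other values fixed, and evaluates the sum of squared differences in each case. The grid size and function names are hypothetical.

```python
def dirichlet_energy(u):
    """Sum of squared forward differences over a square grid (unit spacing)."""
    n = len(u)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                total += (u[i + 1][j] - u[i][j]) ** 2
            if j + 1 < n:
                total += (u[i][j + 1] - u[i][j]) ** 2
    return total

n = 9
harmonic = [[float(i * i - j * j) for j in range(n)] for i in range(n)]   # u = x^2 - y^2

def bumped(eps):
    """Copy of the harmonic grid with one interior value shifted by eps."""
    grid = [row[:] for row in harmonic]
    grid[4][4] += eps
    return grid

e0 = dirichlet_energy(harmonic)
e_up = dirichlet_energy(bumped(+0.5))
e_down = dirichlet_energy(bumped(-0.5))
print(e0, e_up, e_down)
```

Either shift raises the sum by the same amount, $4\epsilon^2$, because the term linear in $\epsilon$ is proportional to the discrete Laplacian, which vanishes for this $u$.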

b. Vibrating String. Let a string of length $l$ be under tension $F$. When it executes
small vibrations, it suffers the displacement $u(x)$ at right angles to its length, which
will be taken along $x$. For any distortion, $l$ changes to $l'$, and

$$l' = \int_0^l \sqrt{1 + u_x^2}\,dx$$

If the distortion is small, the integrand may be expanded to read $1 + \frac{1}{2}u_x^2$,
so that

$$l' - l = \frac{1}{2}\int_0^l u_x^2\,dx$$

The potential energy of the entire string will then be

$$V = \frac{1}{2}F\int_0^l u_x^2\,dx$$

provided the tension $F$ is not changed by the small displacements $u(x)$. The kinetic
energy is, clearly,

$$T = \frac{1}{2}m\int_0^l u_t^2\,dx$$

if $m$ represents the mass of the string per unit length, considered constant. Hamilton's
principle now states:

$$\iint \left(\frac{1}{2}m u_t^2 - \frac{1}{2}F u_x^2\right)dx\,dt$$

shall be stationary. The two variables, $x$ and $t$, are here to be regarded as the
independent ones. The Euler equation (13) for this case is the wave equation:

$$u_{tt} = \frac{F}{m}u_{xx}$$
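As a quick plausibility check of the wave equation just obtained, the following sketch evaluates $u_{tt} - (F/m)u_{xx}$ by central differences for the fundamental standing wave of a string fixed at both ends. The standing-wave form, the numerical values and the function names are assumptions introduced for illustration only.

```python
import math

F, m, length = 4.0, 1.0, 1.0          # tension, mass per unit length, string length
c2 = F / m                            # square of the wave speed
omega = math.pi / length * math.sqrt(c2)

def u(x, t):
    """Fundamental standing wave of a string fixed at x = 0 and x = length."""
    return math.sin(math.pi * x / length) * math.cos(omega * t)

def residual(x, t, h=1e-3):
    """Central-difference estimate of u_tt - (F/m) u_xx."""
    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h ** 2
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h ** 2
    return u_tt - c2 * u_xx

print(residual(0.3, 0.2))
```

The residual is zero up to the truncation error of the difference formulas, which shrinks as `h` is reduced.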

6.5. Accessory Conditions; Isoperimetric Problems. — Problems sometimes arise in which it
is necessary to make an integral stationary while at the same time one or more integrals
involving the same variables are to be kept constant. A typical example, discussed below,
is that of finding the closed plane curve of given perimeter and maximum area. This
example, being one of the earliest to engage mathematical interest, has given this class
of problems the name "isoperimetric."

In general, the presence of accessory conditions can be dealt with by means of Lagrange's
method of undetermined multipliers. We wish to find the stationary value of

$$\int I\,d\tau$$

provided that

$$\int I_1\,d\tau = c_1, \quad \int I_2\,d\tau = c_2, \quad \ldots, \quad \int I_n\,d\tau = c_n \qquad (6\text{-}14)$$

All $I$'s contain the same variables; the limits are fixed and identical in all
integrations, and the integrations may be multiple; in the latter case $d\tau$ stands for
a product of differentials. The $c$'s are understood to be given constants.

We introduce a set of $n$ constant parameters, $\lambda_1, \lambda_2, \ldots, \lambda_n$,
the values of which are not at once specified. It is clear that, if $\int I\,d\tau$ is
stationary,

$$\int K\,d\tau$$

where $K = I + \lambda_1 I_1 + \lambda_2 I_2 + \cdots + \lambda_n I_n$, is also stationary
whatever the values of the $\lambda$'s, because of (14). We are thus confronted with a
problem similar to the foregoing, the minimization (or maximization) of a single integral,
but with a modified integrand: $I$ must be replaced by $K = I + \sum_i \lambda_i I_i$. If
now the same steps are pursued in evaluating $\int \delta K\,d\tau$ as were outlined in
sec. 6.1, we arrive at the equivalent of eq. (6) in which $I$ is now replaced by $K$. But
the passage from (6) to (7) is now obstructed because $\delta y$ is no longer an arbitrary
function: the variations must be in accord with the relations (14). One may say that
$\delta y$ has lost $n$ degrees of freedom. But here the unspecified character of the
$\lambda$'s comes to our rescue. They are precisely $n$ in number and can be so adjusted
that the parentheses vanish.* Hence the transition from eq. (6) to (7) is permitted in
this case as well. The extremals must satisfy Euler's equation

$$\frac{\partial K}{\partial y} - \frac{d}{dx}\frac{\partial K}{\partial y_x} = 0 \qquad (6\text{-}15)$$

or its equivalent (7a). If there are several dependent and independent variables, eqs.
(10) and (13) take the place of (15).

In solving Euler's equation the $\lambda$'s, which are now presumably fixed but unknown,
appear as constants in the extremals. They may be eliminated formally by means of
conditions (14), but their meaning can usually be recognized more directly at some stage
of the solution.


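Examples.

a. To find the plane curve of fixed perimeter and maximum area. We seek that
$r(\varphi)$ which maximizes

$$A = \frac{1}{2}\int_0^{2\pi} r^2\,d\varphi$$

and has a fixed

$$S = \int_0^{2\pi} (r^2 + r_\varphi^2)^{1/2}\,d\varphi$$

Here

$$K = \frac{1}{2}r^2 + \lambda(r^2 + r_\varphi^2)^{1/2}$$

so that (15) reads:

$$r + \lambda r(r^2 + r_\varphi^2)^{-1/2} - \frac{d}{d\varphi}\left[\lambda r_\varphi(r^2 + r_\varphi^2)^{-1/2}\right] = 0$$

This leads to

$$\frac{r^2 + 2r_\varphi^2 - r r_{\varphi\varphi}}{(r^2 + r_\varphi^2)^{3/2}} = -\frac{1}{\lambda}$$

The left of this equation will be recognized as the curvature, $1/\rho$, of the curve.
This is to be constant, hence the curve is a circle with radius $\rho = |\lambda|$.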
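The isoperimetric property of the circle can be illustrated with elementary geometry. The sketch below is an illustration only and forms no part of the variational argument: it compares the areas of regular polygons of fixed perimeter with the area $S^2/4\pi$ of the circle of the same perimeter. The function name is hypothetical.

```python
import math

def regular_polygon_area(n_sides, perimeter):
    """Area of a regular polygon with the given number of sides and perimeter."""
    return perimeter ** 2 / (4.0 * n_sides * math.tan(math.pi / n_sides))

perimeter = 1.0
circle_area = perimeter ** 2 / (4.0 * math.pi)     # circle of circumference `perimeter`
areas = [regular_polygon_area(n, perimeter) for n in (3, 4, 6, 12, 100)]
print(areas, circle_area)
```

The areas increase with the number of sides and approach the circle's value from below without exceeding it.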

* For a more detailed discussion of Lagrange's method of undetermined multipliers see
Page, L., "Introduction to Theoretical Physics," Second Edition, D. Van Nostrand Co., New
York, 1935, p. 311.

b. To prove that the sphere is the solid figure of revolution which, for a given surface
area, has maximum volume. The area is

$$A = 2\pi\int_0^a y(1 + y_x^2)^{1/2}\,dx$$

the volume:

$$V = \pi\int_0^a y^2\,dx$$

Therefore

$$K = y^2 + \lambda y(1 + y_x^2)^{1/2}$$

since we are here permitted to drop constant factors. As $K$ does not contain $x$
explicitly, it is convenient to use (7a) instead of (7) or (15):

$$\frac{\partial K}{\partial x} - \frac{d}{dx}\left(K - y_x\frac{\partial K}{\partial y_x}\right) = 0$$

whence

$$K - y_x\frac{\partial K}{\partial y_x} = y^2 + \lambda y(1 + y_x^2)^{1/2} - \lambda y y_x^2(1 + y_x^2)^{-1/2} = c$$

But clearly, $y = 0$ at $x = 0$ and at $x = a$, which can only be true if $c = 0$. Hence

$$y^2 + \lambda y(1 + y_x^2)^{-1/2} = 0$$

or

$$y = -\lambda(1 + y_x^2)^{-1/2}$$

Solving this for $y_x$ we obtain

$$\frac{dy}{dx} = \pm\frac{\sqrt{\lambda^2 - y^2}}{y}$$

which on integration leads to

$$\mp\sqrt{\lambda^2 - y^2} = x - x_0$$

or,

$$(x - x_0)^2 + y^2 = \lambda^2$$

We note that the figure is a sphere with center on the X-axis at $x_0$ and of radius
$|\lambda|$.

It is possible to work this problem without the use of Lagrangian multipliers by means of
an ingenious method due to Euler. He uses in place of the independent variable $x$ a new
one, $\xi$, which measures essentially the area of revolution formed by the arc $y(x)$
between $x = 0$ and the variable point $x = x$:

$$\xi = \int_0^x y(1 + y_x^2)^{1/2}\,dx$$

In terms of this variable,

$$dx = \frac{(1 - y^2 y_\xi^2)^{1/2}}{y}\,d\xi$$

so that

$$V = \pi\int_0^b y(1 - y_\xi^2 y^2)^{1/2}\,d\xi \qquad (6\text{-}16)$$

Here $b$ represents the value of $\xi$ when $x = a$, that is, the given area divided by
$2\pi$. By keeping $b$ fixed the accessory condition $A$ = const. is automatically
satisfied. This method, while very elegant, cannot be applied generally.

The stationarity condition for (16), if written in the form (7b), yields

$$y(1 - y^2 y_\xi^2)^{1/2} + y^3 y_\xi^2(1 - y^2 y_\xi^2)^{-1/2} = y(1 - y^2 y_\xi^2)^{-1/2} = c \qquad (6\text{-}17)$$

whence

$$y\,y_\xi = \left(1 - \frac{y^2}{c^2}\right)^{1/2}$$

After integration,

$$\left(1 - \frac{y^2}{c^2}\right)^{1/2} = d - \frac{\xi}{c^2}$$

The new constant $d$ must be 1 if the curves are to pass through $\xi = y = 0$. To obtain
the result in terms of $x$ and $y$, we substitute for $y_\xi$ in (17) the value obtained
by solving

$$y_x = y_\xi\,y(1 + y_x^2)^{1/2}$$

Eq. (17) then reads

$$y(1 + y_x^2)^{1/2} = c$$

and this is precisely the equation solved above with $-c = \lambda$.

c. Wave equation. In sec. 6.4 we have seen that Laplace's equation is the necessary
condition that the average of the square of the gradient of a function shall have a
stationary value. If the same quantity is to be made stationary, but with the additional
requirement that $\iiint u^2\,dx\,dy\,dz$ shall have a fixed value, another interesting
equation results. In that case

$$I = u_x^2 + u_y^2 + u_z^2, \qquad I_1 = u^2$$

The integral to be minimized is therefore $\iiint (I + \lambda I_1)\,dx\,dy\,dz$. Euler's
equation [in the form (13)] then reads

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} - \lambda u = 0$$

which is a special form of the wave equation, namely, that describing sinusoidal waves of
a single frequency. (Cf. Chapter 7.) Such a wave may therefore be characterized as a
disturbance in which the displacement $u$ has a fixed mean square value and at the same
time a minimum square gradient.

6.6. Schrödinger Equation. — The fundamental equation of quantum mechanics (sec. 11.9) can
be derived from a variation principle, as will now be shown. We define an operator, known
as the Hamiltonian operator, as follows:

$$H \equiv -k\nabla^2 + V(x,y,z)$$

The physical meaning of $k$ is seen from the relation $k = h^2/8\pi^2 m$ where $h$ is
Planck's constant and $m$ the mass of the particle whose motion is considered; $V$ is its
potential energy. We now seek a function $\psi$, possibly complex, which satisfies the
following two conditions:

$$\iiint \psi^*(H\psi)\,dx\,dy\,dz \qquad (6\text{-}18a)$$

shall be stationary;

$$\iiint \psi^*\psi\,dx\,dy\,dz = 1 \qquad (6\text{-}18b)$$

The integrations are taken over fixed domains of $x$, $y$, and $z$. It will be supposed,
furthermore, that the permissible functions $\psi$ and $\psi^*$ either vanish
sufficiently strongly at the boundaries of the volume of integration, or take on the same
values and derivatives at corresponding points on opposite boundaries.

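When this is true, the following transformation may be made:

$$\int \psi^*\frac{\partial^2\psi}{\partial x^2}\,dx = \left[\psi^*\frac{\partial\psi}{\partial x}\right] - \int \frac{\partial\psi^*}{\partial x}\cdot\frac{\partial\psi}{\partial x}\,dx$$

The integrated part vanishes. As a consequence

$$\iiint \psi^*\nabla^2\psi\,dx\,dy\,dz = -\iiint (\nabla\psi^*)\cdot\nabla\psi\,dx\,dy\,dz$$

and condition (18a) may be modified to read:

$$\delta\iiint [k\,\nabla\psi^*\cdot\nabla\psi + \psi^*\psi\,V(x,y,z)]\,dx\,dy\,dz = 0$$

The function $K$ which appears in Euler's equation [(15) but generalized in accordance
with (13) to take care of the fact that there are now three independent variables] is

$$K = k(\psi_x^*\psi_x + \psi_y^*\psi_y + \psi_z^*\psi_z) + V\psi^*\psi - \lambda\psi^*\psi$$

Euler's equations are ($\psi^*$ and $\psi$ are both dependent variables!)

$$\frac{\partial K}{\partial\psi} - \frac{\partial}{\partial x}\frac{\partial K}{\partial\psi_x} - \frac{\partial}{\partial y}\frac{\partial K}{\partial\psi_y} - \frac{\partial}{\partial z}\frac{\partial K}{\partial\psi_z} = 0$$

$$\frac{\partial K}{\partial\psi^*} - \frac{\partial}{\partial x}\frac{\partial K}{\partial\psi_x^*} - \frac{\partial}{\partial y}\frac{\partial K}{\partial\psi_y^*} - \frac{\partial}{\partial z}\frac{\partial K}{\partial\psi_z^*} = 0$$

They reduce to

$$-k(\psi_{xx} + \psi_{yy} + \psi_{zz}) + (V - \lambda)\psi = 0 \qquad (6\text{-}19)$$

and a similar equation for $\psi^*$. To identify the constant $\lambda$, we note that eq.
(19) may be written

$$H\psi = \lambda\psi$$

If we multiply this equation by $\psi^*$ and integrate over $x$, $y$, $z$ the left side
becomes the stationary integral (18a), which will be denoted by $E$. The right is
$\lambda$ in view of (18b). Hence $\lambda = E$. With this substitution for $\lambda$, eq.
(19) is Schrödinger's equation.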

This result is worth summarizing. Schrödinger's equation serves the purpose of selecting
the extremals $\psi$ which make

$$\iiint \psi^*(H\psi)\,dx\,dy\,dz$$

stationary, provided $\iiint \psi^*\psi\,dx\,dy\,dz$ is held constant. If the latter
constant is unity, then $\iiint \psi^*(H\psi)\,dx\,dy\,dz$ is the energy which appears in
the Schrödinger problem. Further inspection shows the energy to be a minimum rather than a
maximum in most cases of physical interest. Upon these results is based one of the most
powerful methods of obtaining approximate solutions of eq. (19). (Cf. sec. 11.18.)
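The approximation method alluded to here can be sketched in one dimension. The example below is an illustration under stated assumptions, not the treatment of sec. 11.18: it takes units in which $k = \frac{1}{2}$, the potential $V = \frac{1}{2}x^2$, and the trial family $\psi = e^{-\alpha x^2/2}$, then evaluates the ratio of $\int (k\psi_x^2 + V\psi^2)\,dx$ to $\int \psi^2\,dx$ for several values of the hypothetical parameter $\alpha$. Dividing by the normalization integral takes the place of condition (18b).

```python
import math

K = 0.5   # k = h^2 / (8 pi^2 m) in units where hbar = m = 1 (assumed for illustration)

def energy(alpha, half_width=10.0, n=4000):
    """Midpoint-rule ratio of the integral of (K psi'^2 + V psi^2) to that of psi^2,
    for psi = exp(-alpha x^2 / 2) and V = x^2 / 2."""
    dx = 2.0 * half_width / n
    numerator = denominator = 0.0
    for j in range(n):
        x = -half_width + (j + 0.5) * dx
        psi = math.exp(-0.5 * alpha * x * x)
        dpsi = -alpha * x * psi
        numerator += (K * dpsi * dpsi + 0.5 * x * x * psi * psi) * dx
        denominator += psi * psi * dx
    return numerator / denominator

alphas = [0.5 + 0.05 * j for j in range(21)]          # 0.5, 0.55, ..., 1.5
best_alpha = min(alphas, key=energy)
print(best_alpha, energy(best_alpha))
```

For this family the ratio equals $\alpha/4 + 1/(4\alpha)$, so the scan finds its minimum at $\alpha = 1$ with energy $\frac{1}{2}$. Every other trial function in the family gives a larger value.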

6.7. Concluding Remarks. — In concluding this chapter, we note a few possible
generalizations of the theory given here. In the first place, one may remove the
restriction that $\delta y = 0$ at the limits of integration. This means, with reference
to Fig. 1, that the curves $y(x)$ and $Y(x)$ do not have the same termini. The integrated
term which appears in the partial integration leading to eq. (6) will then no longer
vanish, and there arise three conditions in place of eq. (7):

$$\frac{\partial I}{\partial y} - \frac{d}{dx}\frac{\partial I}{\partial y_x} = 0; \qquad \left[\frac{\partial I}{\partial y_x}\right]_{x_1} = 0; \qquad \left[\frac{\partial I}{\partial y_x}\right]_{x_2} = 0$$

The second and third of these then serve to fix the arbitrary constants in the solution
of Euler's equation.

A further generalization is needed when the limits $x_1$ and $x_2$ themselves are no
longer fixed. Whenever this happens, introduction of a new parameter, in terms of which
both $x$ and $y$ may be expressed, reduces the problem to the forms here discussed.* The
Principle of Least Action involves a variation problem with variable limits. Since
Hamilton's principle is in general more powerful, the former, in spite of its historical
interest, will here be omitted.

When the integrand $I$ involves higher derivatives than the first, no great complications
arise. The Euler equation then contains additional terms.† The point where our simple
treatment has been most deficient is in its omission of all considerations establishing
the actual existence of maximizing and minimizing curves. It will be recalled that
Euler's equations are merely necessary conditions. They furnish no assurance whatever
that the curves sought are indeed present among the extremals. For these more
mathematical questions we refer the reader to the treatises by Bolza, Bliss, and Kneser.

* See Byerly, "Introduction to the Calculus of Variations," Harvard University Press,
1917.

† See Kneser, A., "Lehrbuch der Variationsrechnung," Vieweg & Sohn, Braunschweig, 1925.

PARTIAL DIFFERENTIAL EQUATIONS OF CLASSICAL PHYSICS

7.1. General Considerations. — The general theory of partial differential equations is
well beyond the scope of this book and will not be developed in a systematic way.*
Attention will here be limited to a small number of partial differential equations which
are of frequent occurrence, almost all of which may be resolved by a powerful method
known as the separation of variables. Before we proceed to consider specific examples,
however, a few remarks about the meaning and variety of the solutions are in order.

The simplest type of an ordinary differential equation, that of the first 
order, has a general solution which contains one arbitrary constant; geo- 
metrically it may be interpreted as a set of plane curves labeled by different 
values of the arbitrary constant. In particular, if the equation is linear, 
there is but one curve passing through a given point, and this is uniquely 
specified when the value of y for some value of x is prescribed. 

The simplest type of partial differential equation is one with two independent variables
($x$ and $y$), and the dependent variable ($z$), which is linear and of the first order.
Its solutions represent, geometrically, a set of surfaces constructed over the X-Y plane.
The question may be asked: Is one such surface uniquely determined by requiring that it
include a given point? If this were true, the manifold of solutions of the differential
equation, $z(x,y)$, would reduce to a single surface when it is specified that the
solution shall contain that point.

This, however, is not the case. For consider the simple equation
$\partial z/\partial x + \partial z/\partial y = 0$. It is clear that any function of the
form $z = \varphi(x - y)$ will satisfy it. This function is not uniquely determined by
fixing one point of it. The origin, for instance, is contained in all surfaces
$z = c(x - y)$, and yet every different value of $c$ defines a different surface.

Neither does a prescribed curve fix a surface. For let it be required that the solution
$z$ of the partial equation above shall pass through the line $x = y$ in the X-Y plane.
This is certainly accomplished by taking $z = (x - y)^n$, but there is an infinite number
of such surfaces depending on the parameter $n$. It is clear from these elementary
considerations that in dealing with the solutions of partial differential equations we
are confronted with a variety of functions which far transcends the degree of complexity
encountered in connection with ordinary differential equations. In fact one must not be
surprised to find that the complete geometric specification of a solution of a partial
equation even of the simplest type usually requires the fixation of an infinite number of
parameters.

* A more complete discussion may be found in Frank, P. and v. Mises, R.,
"Differentialgleichungen der Physik," Braunschweig, 1930; "Handbuch der Physik," Vol. III,
Partielle Differentialgleichungen (article by J. Lense); Courant, R. and Hilbert, D., "Die
Methoden der Mathematischen Physik," Vol. II, Berlin, 1937.
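The non-uniqueness described above is easy to observe numerically. In the sketch below, offered as an illustration with arbitrarily chosen functions $\varphi$, three different surfaces $z = \varphi(x - y)$ are tested against $\partial z/\partial x + \partial z/\partial y = 0$ by central differences, and each is evaluated on the line $x = y$.

```python
import math

def residual(phi, x, y, h=1e-5):
    """Central-difference estimate of dz/dx + dz/dy for z(x, y) = phi(x - y)."""
    z = lambda a, b: phi(a - b)
    dz_dx = (z(x + h, y) - z(x - h, y)) / (2.0 * h)
    dz_dy = (z(x, y + h) - z(x, y - h)) / (2.0 * h)
    return dz_dx + dz_dy

candidates = {
    "linear": lambda s: 3.0 * s,
    "cubic": lambda s: s ** 3,
    "sine": math.sin,
}
residuals = {name: residual(phi, 0.7, 0.2) for name, phi in candidates.items()}
values_on_line = {name: phi(0.0) for name, phi in candidates.items()}   # z on the line x = y
print(residuals, values_on_line)
```

All three surfaces satisfy the equation and all three contain the line $x = y$ in the X-Y plane, yet they are plainly different surfaces.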

7.2. Laplace's Equation. — An equation which arises in almost all branches of analysis is
Laplace's:

$$\nabla^2 V = 0 \qquad (7\text{-}1)$$

Its intuitive meaning was discussed in the chapter on the calculus of variations (sec.
6.4), where eq. (7-1) was shown to be equivalent to the postulate that $V$ shall have the
least mean square gradient. The function $V$ satisfying (1) may be said to be the
"smoothest" of all functions. This is obvious when Laplace's equation is solved in one
dimension, for then it simply reads: $d^2V/dx^2 = 0$ and has as its solutions all
straight lines.

To indicate briefly the range of application of eq. (1) we state three 
instances in which it occurs: 

a. A fundamental theorem of function theory states:

Let $z = x + iy$; then the function $f(z)$ takes the form

$$f(z) = u(x,y) + iv(x,y)$$

wherein $u$, $v$, $x$, $y$ are all real, if and only if the functions $u$ and $v$ satisfy:

$$\nabla^2 u = 0, \qquad \nabla^2 v = 0$$

b. In sec. 4.12 it was shown that the velocity $\mathbf{v}$ of an indestructible fluid, as
a function of space coordinates and the time, must be a solution of the equation of
continuity, which reads

$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0$$

If the fluid is incompressible, its density $\rho$ is constant, and the equation reads

$$\nabla\cdot\mathbf{v} = 0$$

If, furthermore, the motion is irrotational, the velocity vector is the gradient of a
scalar function $V$, known as the velocity potential: $\mathbf{v} = -\nabla V$, and the
equation of continuity thus becomes equivalent to Laplace's: $\nabla^2 V = 0$.

c. The electrostatic potential in a region of space not occupied by charges satisfies
Laplace's equation.

Before discussing a partial differential equation of this general form one must realize,
of course, that its solutions for different numbers of dimensions (independent variables)
are quite different; moreover, that the form of the solution even for the same number of
dimensions will be different in different systems of coordinates.

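7.3. Laplace's Equation in Two Dimensions. — a. Rectangular Coordinates. The equation
reads:

$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} = 0 \qquad (7\text{-}2)$$

A method, not of universal applicability but suitable for this particular problem,
involves the transformation to a new set of independent variables:

$$\xi = x + iy, \qquad \eta = x - iy$$

In terms of these,

$$\frac{\partial^2 V}{\partial x^2} = \frac{\partial^2 V}{\partial\xi^2} + 2\frac{\partial^2 V}{\partial\xi\,\partial\eta} + \frac{\partial^2 V}{\partial\eta^2}, \qquad \frac{\partial^2 V}{\partial y^2} = -\frac{\partial^2 V}{\partial\xi^2} + 2\frac{\partial^2 V}{\partial\xi\,\partial\eta} - \frac{\partial^2 V}{\partial\eta^2}$$

so that

$$\nabla^2 V = 4\frac{\partial^2 V}{\partial\xi\,\partial\eta} = 0$$

Clearly, this equation admits both $V = f(\xi)$ and $V = f(\eta)$ as solutions, hence

$$V = f_1(\xi) + f_2(\eta) = f_1(x + iy) + f_2(x - iy)$$

where $f_1$ and $f_2$ are any two functions which are twice differentiable. The reader
will hardly fail to see the connection between this result and the statement above
concerning the functions of a complex variable.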
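The general solution just found can be tested with complex arithmetic. The sketch below is an illustration in which $f_1$ and $f_2$ are arbitrarily chosen polynomials: it forms $V = f_1(x + iy) + f_2(x - iy)$ and estimates its two-dimensional Laplacian with the five-point difference formula.

```python
def laplacian(func, x, y, h=1e-3):
    """Five-point central-difference estimate of the two-dimensional Laplacian."""
    return (func(x + h, y) + func(x - h, y) + func(x, y + h) + func(x, y - h)
            - 4.0 * func(x, y)) / h ** 2

def f1(w):
    return w ** 3          # any twice-differentiable function would do

def f2(w):
    return 2.0 * w ** 2

def potential(x, y):
    """V = f1(x + iy) + f2(x - iy); complex-valued in general."""
    return f1(complex(x, y)) + f2(complex(x, -y))

print(abs(laplacian(potential, 0.4, -0.3)))
```

Since the equation is linear with real coefficients, the real and imaginary parts of such a $V$ are separately solutions.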

For many problems another form of solution, obtainable by the method of separation of
variables, is more satisfactory. Let us make the assumption, justifiable by its success,
that $V$ may be written in the form

$$V = X(x)\cdot Y(y) \qquad (7\text{-}3)$$

where $X$ and $Y$ are functions of only one independent variable, $x$ and $y$,
respectively. When (3) is substituted in (2) there results, after division by $V$,

$$\frac{X''}{X} + \frac{Y''}{Y} = 0 \qquad (7\text{-}4)$$

an equation in which primes denote differentiation of a function with respect to its own
variable. If (4) is to have a solution at all, then each term on the left must separately
be equal to a constant; for a change in $x$ would not alter the value of $Y''/Y$, and a
change in $y$ would not affect $X''/X$. One may therefore conclude:

$$\frac{X''}{X} = k^2, \qquad \frac{Y''}{Y} = -k^2 \qquad (7\text{-}5)$$

where the constant parameter $k^2$, written in this form for convenience, may have any
value, real or complex. These are two ordinary equations which may easily be solved by
the methods of Chapter 2. Eq. (5) leads at once to

$$X = c_1 e^{\pm kx}, \qquad Y = c_2 e^{\pm iky}$$

Hence a solution of (2), characterized by a given value of the parameter $k$, will be

$$V_k = c\,e^{\pm kx \pm iky} \qquad (7\text{-}6)$$

Since (2) is a linear equation, a sum of expressions like (6) is also a solution. Hence a
more general solution is

$$V = \sum_k c_k\,e^{\pm kx \pm iky}$$

or even

$$V = \int c(k)\,e^{\pm kx \pm iky}\,dk \qquad (7\text{-}6a)$$

For the value $k = 0$ the result is of a more special form. Eq. (5) then leads to

$$X = a_1 x + a_2, \qquad Y = b_1 y + b_2$$

so that

$$V = axy + cx + dy + e \qquad (7\text{-}7)$$

Which of the solutions, (6), (6a), or (7), is to be chosen depends entirely on the nature
of the problem at hand. (Cf. examples.)

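b. Polar Coordinates. Laplace's equation reads:

$$\frac{\partial^2 V}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial V}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2 V}{\partial\varphi^2} = 0 \qquad (7\text{-}8)$$

Using again the method of separation of variables, we put

$$V = P(\rho)\Phi(\varphi)$$

When this is substituted into (8) there results, after multiplication by $\rho^2/V$,

$$\rho^2\frac{P''}{P} + \rho\frac{P'}{P} + \frac{\Phi''}{\Phi} = 0 \qquad (7\text{-}9)$$

Here the first two terms are independent of $\varphi$, the third is independent of
$\rho$. Hence we may write

$$\rho^2\frac{P''}{P} + \rho\frac{P'}{P} = k^2, \qquad \frac{\Phi''}{\Phi} = -k^2$$

The solution of the first equation is at once seen to be $P = \rho^{\pm k}$, that of the
second, $\Phi = e^{\pm ik\varphi}$. Hence

$$V_k = c\,\rho^{\pm k}e^{\pm ik\varphi} \qquad (7\text{-}10)$$

or, more generally,

$$V = \sum_k c_k\,\rho^{\pm k}e^{\pm ik\varphi} \qquad (7\text{-}10a)$$

For $k = 0$, (9) becomes

$$P'' + \frac{P'}{\rho} = 0, \qquad \Phi'' = 0$$

When integrated once, the first of these yields $P' = c\rho^{-1}$; after another
integration $P = a_1\ln\rho + a_2$. On the other hand, $\Phi'' = 0$ leads to
$\Phi = b_1\varphi + b_2$. Hence a particular solution is

$$V = (a_1\ln\rho + a_2)(b_1\varphi + b_2) \qquad (7\text{-}11)$$

Again, further information must be available before a special one of these results can be
selected as a suitable solution of a given problem.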
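The separated solutions in polar coordinates can be checked against eq. (7-8) directly. The sketch below is an illustration: it uses real combinations of the forms (7-10) and (7-11) with the arbitrarily chosen value $k = 3$ and estimates the left side of (7-8) by finite differences. The function name `polar_laplacian` is hypothetical.

```python
import math

def polar_laplacian(v, rho, phi, h=1e-3):
    """Finite-difference estimate of V_rr + V_r / rho + V_phiphi / rho^2 (eq. 7-8)."""
    v_rr = (v(rho + h, phi) - 2.0 * v(rho, phi) + v(rho - h, phi)) / h ** 2
    v_r = (v(rho + h, phi) - v(rho - h, phi)) / (2.0 * h)
    v_pp = (v(rho, phi + h) - 2.0 * v(rho, phi) + v(rho, phi - h)) / h ** 2
    return v_rr + v_r / rho + v_pp / rho ** 2

k = 3
solutions = {
    "rho^k cos(k phi)": lambda rho, phi: rho ** k * math.cos(k * phi),
    "rho^-k sin(k phi)": lambda rho, phi: rho ** (-k) * math.sin(k * phi),
    "(ln rho)(phi)": lambda rho, phi: math.log(rho) * phi,
}
residuals = {name: polar_laplacian(v, 1.5, 0.4) for name, v in solutions.items()}
print(residuals)
```

The residuals vanish to within the truncation error of the difference formulas. A mismatched pair such as $\rho^2\cos 3\varphi$ would not pass the same check.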


7.4. Laplace's Equation in Three Dimensions.* — a. Rectangular Coordinates. An application
of the method of separation of variables to

$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0$$

follows precisely along the lines of sec. 7.3a. We put $V = X(x)Y(y)Z(z)$ and obtain

$$\frac{X''}{X} + \frac{Y''}{Y} + \frac{Z''}{Z} = 0$$

Each of these terms must separately equal a constant, and the sum of these constants
(which we write as $k_1^2$, $k_2^2$, $k_3^2$) must vanish. Thus

$$V_{k_1k_2k_3} = c\,e^{k_1x + k_2y + k_3z}, \qquad k_1^2 + k_2^2 + k_3^2 = 0 \qquad (7\text{-}12)$$

If $k_1$, $k_2$, or $k_3$ is zero, the corresponding factor in (12) must be replaced by
$a_1x + a_2$, etc. A more general solution would be

$$V = \sum_{k_1,k_2,k_3} V_{k_1k_2k_3} \qquad (7\text{-}12a)$$

* A solution of Laplace's equation in three dimensions is often called an "harmonic"
function.

In this connection it is sometimes convenient to regard $k_1$, $k_2$, $k_3$ formally as
the components of a vector $\mathbf{k}$. Eq. (12) may then be written

$$V_{\mathbf{k}} = c(\mathbf{k})\,e^{\mathbf{k}\cdot\mathbf{r}}, \qquad |\mathbf{k}| = 0$$
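The condition on the vector $\mathbf{k}$ can only be met if at least one component is complex, since the sum of three real squares vanishes only when all are zero. The sketch below is an illustration with the arbitrarily chosen components $(3, 4, 5i)$: it confirms that $k_1^2 + k_2^2 + k_3^2 = 0$ and that the corresponding exponential satisfies Laplace's equation to within finite-difference error.

```python
import cmath

def laplacian3(func, x, y, z, h=1e-3):
    """Seven-point central-difference estimate of the three-dimensional Laplacian."""
    centre = func(x, y, z)
    neighbours = (func(x + h, y, z) + func(x - h, y, z)
                  + func(x, y + h, z) + func(x, y - h, z)
                  + func(x, y, z + h) + func(x, y, z - h))
    return (neighbours - 6.0 * centre) / h ** 2

k = (3.0, 4.0, 5.0j)                       # 9 + 16 - 25 = 0
k_dot_k = sum(component ** 2 for component in k)

def plane_solution(x, y, z):
    """V = exp(k1 x + k2 y + k3 z) with k1^2 + k2^2 + k3^2 = 0."""
    return cmath.exp(k[0] * x + k[1] * y + k[2] * z)

print(k_dot_k, abs(laplacian3(plane_solution, 0.1, 0.2, 0.3)))
```

The imaginary component makes the solution oscillate in $z$ while it grows exponentially in $x$ and $y$. This is the general pattern for separated solutions in rectangular coordinates.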

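b. Cylindrical Coordinates. In accordance with the results of Chapter 5,

$$\nabla^2 V = \frac{\partial^2 V}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial V}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2 V}{\partial\varphi^2} + \frac{\partial^2 V}{\partial z^2} = 0$$

Put

$$V = P(\rho)Z(z)\Phi(\varphi)$$

substitute, and divide by $V$. The result is

$$\frac{P''}{P} + \frac{1}{\rho}\frac{P'}{P} + \frac{1}{\rho^2}\frac{\Phi''}{\Phi} + \frac{Z''}{Z} = 0$$

Clearly, the last term on the left must be constant; let us put it equal to $-k^2$. Then

$$Z = c_1 e^{\pm ikz} \qquad (7\text{-}13)$$

The remaining equation,

$$\frac{P''}{P} + \frac{1}{\rho}\frac{P'}{P} + \frac{1}{\rho^2}\frac{\Phi''}{\Phi} - k^2 = 0$$

when multiplied by $\rho^2$, separates again into two equations:

$$\frac{\Phi''}{\Phi} = -l^2, \qquad \rho^2P'' + \rho P' - (k^2\rho^2 + l^2)P = 0$$

The first has the solution $\Phi = c_2 e^{\pm il\varphi}$; the second turns into Bessel's
differential equation (2-57) when the substitution $ik\rho = x$ is made, for it then
reads:

$$\frac{d^2P}{dx^2} + \frac{1}{x}\frac{dP}{dx} + \left(1 - \frac{l^2}{x^2}\right)P = 0$$

The solution of this equation was discussed in sec. 2.14. It will here be denoted by
$Z_l$. Collecting these results we have

$$V_{kl} = c_{kl}\,Z_l(ik\rho)\,e^{\pm ikz}e^{\pm il\varphi} \qquad (7\text{-}14)$$

When $l = 0$, $\Phi = a_1\varphi + a_2$; hence we obtain as another solution of lesser
generality than (14) the expression:

$$V_{k0} = c_{k0}\,Z_0(ik\rho)\,e^{\pm ikz}(a_1\varphi + a_2) \qquad (7\text{-}14a)$$

When $k = 0$, $Z = (b_1z + b_2)$ instead of the function (13). The equation for $P$ takes
the form

$$\rho^2P'' + \rho P' - l^2P = 0$$

which was already encountered in sec. 7.3b. It has the solution $\rho^{\pm l}$. Hence

$$V_{0l} = \rho^{\pm l}(b_1z + b_2)\,e^{\pm il\varphi} \qquad (7\text{-}14b)$$

Finally, when both $l$ and $k$ are zero, the solution may be seen to take the form

$$V_{00} = (a_1\ln\rho + a_2)(b_1z + b_2)(c_1\varphi + c_2) \qquad (7\text{-}14c)$$

The most general function satisfying Laplace's equation is a superposition of solutions
(14)-(14c).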
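An axially symmetric instance of (14) can be verified numerically. The sketch below is an illustration resting on one identification not made in the text: for $l = 0$ the solution of Bessel's equation that is finite on the axis, taken at the imaginary argument $ik\rho$, is the modified Bessel function $I_0(k\rho) = J_0(ik\rho)$, summed here from its power series. The value of $k$ and the function names are arbitrary choices.

```python
import math

def bessel_i0(x, terms=40):
    """Modified Bessel function I0(x) = J0(ix), summed from its power series."""
    total, term = 0.0, 1.0
    for n in range(1, terms + 1):
        total += term
        term *= (x / 2.0) ** 2 / (n * n)
    return total

def potential(rho, z, k=1.3):
    """Axially symmetric solution I0(k rho) cos(k z): an l = 0 case of (7-14)."""
    return bessel_i0(k * rho) * math.cos(k * z)

def cylindrical_laplacian(v, rho, z, h=1e-3):
    """Finite-difference estimate of V_rr + V_r / rho + V_zz (no phi dependence)."""
    v_rr = (v(rho + h, z) - 2.0 * v(rho, z) + v(rho - h, z)) / h ** 2
    v_r = (v(rho + h, z) - v(rho - h, z)) / (2.0 * h)
    v_zz = (v(rho, z + h) - 2.0 * v(rho, z) + v(rho, z - h)) / h ** 2
    return v_rr + v_r / rho + v_zz

print(cylindrical_laplacian(potential, 0.8, 0.5))
```

The oscillation in $z$ is paired with monotonic growth in $\rho$. Had the separation constant been chosen with the opposite sign, the roles would be exchanged and the ordinary Bessel function $J_0$ would appear.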

c. Polar (Spherical) Coordinates. As was shown in the chapter on coordinate systems (sec.
5.4), the equation $\nabla^2 V = 0$, when transformed to polar coordinates, reads:

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial V}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial V}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 V}{\partial\varphi^2} = 0 \qquad (7\text{-}15)$$

Multiplication by $r^2\sin^2\theta$ will isolate the term
$\partial^2 V/\partial\varphi^2$ as the only one depending on $\varphi$ from the
remainder of the equation. If, therefore, we put it equal to $-m^2V$, so that

$$\Phi = e^{\pm im\varphi} \qquad (7\text{-}16)$$

($V$ being written as $R(r)\cdot\Theta(\theta)\cdot\Phi(\varphi)$), then eq. (15) takes
the form

$$\frac{\sin^2\theta}{R}\frac{d}{dr}(r^2R') + \frac{\sin\theta}{\Theta}\frac{d}{d\theta}(\sin\theta\,\Theta') - m^2 = 0$$

When this is divided through by $\sin^2\theta$ the terms involving $r$ are cleanly
separated from those involving $\theta$. Hence we obtain

$$\frac{1}{\Theta\sin\theta}\frac{d}{d\theta}(\sin\theta\,\Theta') - \frac{m^2}{\sin^2\theta} + c = 0 \qquad (7\text{-}17)$$

$$\frac{1}{R}\frac{d}{dr}(r^2R') - c = 0 \qquad (7\text{-}18)$$

where $c$ denotes the same constant in both equations. It will prove convenient to write
this constant in the form $c = l(l + 1)$. Let us now make the substitution
$\cos\theta = x$ in eq. (17), obtaining (after multiplication by $\Theta$)

$$\frac{d}{dx}\left[(1 - x^2)\frac{d\Theta}{dx}\right] + \left[l(l + 1) - \frac{m^2}{1 - x^2}\right]\Theta = 0 \qquad (7\text{-}19)$$

This, however, is none other than the differential equation (cf. eq. 2-41) for associated
spherical harmonics discussed in sec. 2.11. Special solutions were studied in sec. 3.6.
They were written in the form

$$\Theta = P_l^m(x)$$

It must here be noted that these functions do not represent the general solution of eq.
(19), but a particular one having the property of being finite for all values of $x$
between $-1$ and $+1$, including these limits. In most physical problems this is a
condition naturally to be imposed on the solution of Laplace's equation; there are cases,
however, in which a more general solution of (19) must be chosen. It was also found in
sec. 2.11 that the constant $l$ must, for the sake of finiteness, be a positive integer.
We shall restrict the present consideration to problems in which these conditions hold,
and assume

$$\Theta = P_l^m(\cos\theta) \qquad (7\text{-}20)$$

This expression has no meaning unless $m$, also, is a positive integer. Again, the nature
of most physical problems imposes this requirement. For if $V$ represents the
distribution of any physical quantity in space, it must obviously be periodic in
$\varphi$ and have a period of $2\pi$, since otherwise $V(\varphi)$ and
$V(2\pi + \varphi)$ would have different values although $\varphi$ and $2\pi + \varphi$
denote the same angle. But the function (16) does not possess this periodicity unless $m$
is an integer.

The function $R$ is now easily obtained by solving (18) which reads on expansion:

$$r^2R'' + 2rR' - l(l + 1)R = 0$$

Its solution is obviously of the form $R = r^\alpha$, and on substitution we find

$$\alpha(\alpha - 1) + 2\alpha - l(l + 1) = 0$$

so that $\alpha$ is either $l$ or $-(l + 1)$. Hence

$$R = a_1r^l + a_2r^{-l-1} \qquad (7\text{-}21)$$

In view of (16), (20) and (21) we conclude that a solution of Laplace's equation in polar
coordinates has the form

$$V_{lm} = (a_1r^l + a_2r^{-l-1})\,P_l^m(\cos\theta)\,e^{\pm im\varphi} \qquad (7\text{-}22)$$

and the general solution will be a superposition of any number of such functions.
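The two admissible radial exponents can be confirmed numerically for the axially symmetric case $m = 0$. The sketch below is an illustration with $l = 2$ and an arbitrarily chosen field point: it builds $r^\alpha P_l(\cos\theta)$ in Cartesian coordinates, using the standard three-term recurrence for the Legendre polynomials, and estimates its Laplacian for $\alpha = l$, for $\alpha = -(l + 1)$, and for an exponent that is not a root of the indicial equation.

```python
import math

def legendre(l, x):
    """Legendre polynomial P_l(x) from the standard three-term recurrence."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def solid_harmonic(l, exponent):
    """Return V(x, y, z) = r**exponent * P_l(cos theta), the m = 0 form of (7-22)."""
    def v(x, y, z):
        r = math.sqrt(x * x + y * y + z * z)
        return r ** exponent * legendre(l, z / r)
    return v

def laplacian3(func, x, y, z, h=1e-3):
    """Seven-point central-difference estimate of the Laplacian."""
    neighbours = (func(x + h, y, z) + func(x - h, y, z) + func(x, y + h, z)
                  + func(x, y - h, z) + func(x, y, z + h) + func(x, y, z - h))
    return (neighbours - 6.0 * func(x, y, z)) / h ** 2

l = 2
residuals = [laplacian3(solid_harmonic(l, a), 0.6, -0.4, 0.9) for a in (l, -(l + 1))]
wrong = laplacian3(solid_harmonic(l, l + 1), 0.6, -0.4, 0.9)   # exponent not a root
print(residuals, wrong)
```

Only the two exponents $l$ and $-(l + 1)$ give a vanishing Laplacian. For any other exponent $\alpha$ the result is $[\alpha(\alpha + 1) - l(l + 1)]\,r^{\alpha - 2}P_l(\cos\theta)$.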

Other systems of coordinates in which the equation $\nabla^2 V = 0$ can be solved by the
method of separation of variables are listed in Chapter 5. It is felt, however, that the
foregoing special cases illustrate the procedure.

EXAMPLES OF SOLUTIONS OF LAPLACE'S EQUATION

7.5. Sphere Moving through an Incompressible Fluid without Vortex Formation. — Since the
motion of the liquid is irrotational, its velocity, $\mathbf{v}$, at every point is the
gradient of a scalar potential, $V$, which satisfies Laplace's equation. Thus

$$\mathbf{v} = -\nabla V, \quad \text{and} \quad \nabla^2 V = 0$$

Which of all the solutions derived above is to be chosen depends entirely on the boundary
conditions of the problem. These are, clearly:

a. The radial velocity of the fluid at the surface of the sphere of radius $r_0$ shall be
equal to the velocity of the sphere times the cosine of the angle which $\mathbf{r}$
makes with the direction of motion of the sphere. Taking the latter as the polar axis, we
have

$$-\left(\frac{\partial V}{\partial r}\right)_{r=r_0} = v_0\cos\theta \qquad (a)$$

b. The distant portions of the liquid are not affected by the motion of the sphere. Hence

$$\left(\frac{\partial V}{\partial r}\right)_{r=\infty} = 0 \qquad (b)$$

The form of these conditions at once prescribes the use of polar coordinates. The
solution is, therefore, of the form (22). To satisfy (a) we must put the angular part of
this expression equal to $\cos\theta$; there is no dependence on $\varphi$ at all. The
only possible value of $m$ which produces freedom from $\varphi$ is zero, and of all the
functions $P_l^0(\cos\theta)$, only $P_1^0(\cos\theta)$ is equal to $\cos\theta$. Hence
$l = 1$. Condition (a) now states:

$$-\frac{\partial}{\partial r}\left(a_1r + a_2r^{-2}\right)\cos\theta\,\Big|_{r=r_0} = v_0\cos\theta$$

whence

$$-a_1 + 2a_2r_0^{-3} = v_0$$

But condition (b) cannot be satisfied unless $a_1 = 0$. Therefore $2a_2r_0^{-3} = v_0$.
Eq. (22) has thus been reduced to

$$V = \frac{v_0r_0^3}{2r^2}\cos\theta$$

This represents the velocity potential for the case in question.
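Both boundary conditions can be checked numerically against the potential just obtained. The sketch below is an illustration with arbitrarily chosen values of $v_0$ and $r_0$: it differentiates $V$ radially by central differences, compares $-\partial V/\partial r$ at $r = r_0$ with $v_0\cos\theta$, and evaluates the radial velocity far from the sphere.

```python
import math

v0, r0 = 2.0, 0.5      # sphere speed and radius (illustrative values)

def potential(r, theta):
    """Velocity potential V = v0 r0^3 cos(theta) / (2 r^2) outside the sphere."""
    return v0 * r0 ** 3 * math.cos(theta) / (2.0 * r ** 2)

def radial_velocity(r, theta, h=1e-6):
    """Radial component of v = -grad V, by central difference."""
    return -(potential(r + h, theta) - potential(r - h, theta)) / (2.0 * h)

angles = (0.0, 0.7, math.pi / 2, 2.5)
mismatch = max(abs(radial_velocity(r0, t) - v0 * math.cos(t)) for t in angles)
far_speed = abs(radial_velocity(1.0e3, 0.0))
print(mismatch, far_speed)
```

The radial velocity falls off as $r^{-3}$, so the disturbance produced by the sphere is confined to its neighborhood.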

7.6. Simple Electrostatic Potentials. — As a matter of illustration, we consider the
simplest electrostatic potentials from the point of view of Laplace's equation: that due
to a charged conducting sphere, a uniformly charged cylinder (wire) of infinite length,
and a uniformly charged infinite plane.

a. The boundary condition in the first case is obviously $V(r_0) = V_0$, a constant,
provided we write $r_0$ for the radius of the sphere. Since spherical polar coordinates
are used in describing this condition, the general solution of Laplace's equation must be
taken in the form (22). The condition also prescribes that $V$ shall be independent of
$\theta$ and $\varphi$, for otherwise $V$ could not be constant. Hence $m$ and $l$ must
both be zero. We conclude, therefore, that $V = a_1 + a_2/r$, and since
$a_1 + a_2/r_0 = V_0$, we find on eliminating $a_2$:

$$V = a_1 + (V_0 - a_1)\frac{r_0}{r}$$

If we require in addition that $V$ be zero at $r = \infty$ (which would be true if the
potential were produced entirely by the charged sphere) the constant $a_1 = 0$.

b. The boundary condition in the case of the cylinder reads
$V(\rho_0, z, \varphi) = V_0$, $\rho_0$ denoting the radius of the cylinder. Solution (14)
is now relevant; but the observation that $V$ is independent of $\varphi$ and $z$ leads at
once to (14c) with $b_1 = c_1 = 0$. Hence we have $V = a_1\ln\rho + a_2$. On eliminating
$a_2$ by means of the boundary condition we find

$$V = V_0 + a_1\ln\frac{\rho}{\rho_0}$$

The constant $a_1$ can be determined only when further facts, e.g., the charge density on
the cylinder, are known. (In fact, $a_1 = -2\lambda$ where $\lambda$ is linear charge
density.)

c. In the case of the charged plane we require $V(x,y,0) = V_0$, supposing $z = 0$ to
define the plane. This leads at once to a solution of the form (12), but with
$k_1 = k_2 = 0$; for otherwise $V$ would depend on $x$ and $y$. Since then $k_3$ must
also vanish,

$$V = (a_1x + a_2)(b_1y + b_2)(c_1z + c_2)$$

Again, to satisfy the boundary condition, $a_1 = b_1 = 0$, so that

$$V = c_1z + V_0$$

The constant $c_1$ can be eliminated when the charge density on the plane is known.
($c_1 = -4\pi\sigma$ if $\sigma$ is surface density of charge.)

All these results could have been obtained much more simply by applying Gauss' law of
electrostatics; our purpose here was to exhibit them as solutions of Laplace's equation.

Problem. To find the potential produced when a conducting sphere is placed in an
originally uniform field of strength $E_0$, extending along the Z-axis. Use as boundary
conditions:

$V = 0$ at $r = r_0$ (radius of sphere)

$V = -E_0z = -E_0r\cos\theta$ at $r = \infty$

Ans. $V = -E_0r\cos\theta\left(1 - \dfrac{r_0^3}{r^3}\right)$

Fig. 7-1 

7.7. Conducting Sphere in the Field of a Point Charge. — We wish to find the potential at
$P$ due to a point charge $+q$ situated at $z = a$ on the Z-axis when a conducting earthed
sphere, distorting the field, is placed with its center at $O$ (cf. Fig. 1). Clearly,

$$V(r,\theta) = \frac{q}{s} + U$$

if $U$ is the potential due to the induced charge on the sphere. Like the first term
$q/s$, $U$ must be a solution of Laplace's equation and may conveniently be written in the
form (22). Here, however, it becomes necessary to retain full generality and use a
superposition of harmonic functions:

$$U = \sum_{l,m}(a_lr^l + b_lr^{-l-1})\,P_l^m(\cos\theta)\,e^{\pm im\varphi}$$

From the symmetry of the physical distribution about the Z-axis it is clear that $U$
cannot depend on $\varphi$; hence $m = 0$. Also, since $U$ must vanish at $r = \infty$,
every $a_l = 0$. Hence

$$U = \sum_l b_lr^{-l-1}P_l(\cos\theta) \qquad (7\text{-}23)$$

The coefficients $b_l$ are to be determined by the condition that $V$ shall be zero on
the surface of the sphere:

$$\left(\frac{q}{s}\right)_{\text{on sphere}} + \sum_l b_lr_0^{-l-1}P_l(\cos\theta) = 0$$

The first term on the left can be expanded by means of a theorem proved in the discussion
of the Legendre polynomials (eq. 3-24 et seq.)

$$\frac{q}{s} = \frac{q}{a}\sum_{l=0}^{\infty}\left(\frac{r}{a}\right)^lP_l(\cos\theta) \quad \text{if } a > r \qquad (7\text{-}24a)$$

Hence the foregoing condition becomes:

$$\sum_l\left[\frac{q\,r_0^l}{a^{l+1}} + b_lr_0^{-l-1}\right]P_l(\cos\theta) = 0$$

But this is satisfied only if the coefficient of every $P_l$ is zero, so that

$$b_l = -q\,\frac{r_0^{2l+1}}{a^{l+1}}$$

On substituting this back into (23) we find

$$U = -q\sum_l\frac{r_0^{2l+1}}{a^{l+1}}\,r^{-l-1}P_l(\cos\theta) \qquad (7\text{-}25)$$

a result which permits a very simple and interesting interpretation. Consider a point,
such as $a'$ (cf. Fig. 1) on the Z-axis. If $r > a'$, the expansion of $1/s'$ may be seen
to be (see derivation leading to eq. 3-27)

$$\frac{1}{s'} = \frac{1}{r}\sum_l\left(\frac{a'}{r}\right)^lP_l(\cos\theta); \quad r > a' \qquad (7\text{-}24b)$$

in contrast to (24a). But (25) is of the same form as this; indeed it becomes

$$U = -\frac{r_0}{a}\,q\,\frac{1}{s'}$$

if we put $a' = r_0^2/a$. Our final result may now be written

$$V = \frac{q}{s} - \frac{q'}{s'} \qquad (7\text{-}26)$$

provided $q'$ is identified with $(r_0/a)q$. In words: when a conducting (earthed) sphere
is placed near a point charge $+q$ it changes the potential in the same manner as would a
point charge of opposite sign and magnitude $q' = (r_0/a)q$ placed at the point
$a' = r_0^2/a$. The charge $-q'$ is said to be the image of $q$.

The same reasoning holds when an earthed plane is placed near a charge. 
For, suppose we put a = ro + A, a' ^ Tq — A' and let go to infinity. 
From a' a = Tq we then get A = A\ and ry/a approaches 1. It is seen 
that the effect of the plane can also be expressed by means of an image 
charge which, in this case, has the same magnitude as the real charge and is 
located at its mirror image. 
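
The image construction lends itself to a quick numerical check. The Python
sketch below is an illustration only: the values q = 1, a = 2, r_0 = 1 and
the function name `potential` are assumptions chosen for the example, not
part of the text. It evaluates (7-26) with the image charge $(r_0/a)q$
placed at $a' = r_0^2/a$ and tests whether $V$ vanishes on the sphere.

```python
import math

def potential(r, theta, q=1.0, a=2.0, r0=1.0):
    """Eq. (7-26): charge q at z = a plus its image, evaluated at (r, theta)."""
    q_img = (r0 / a) * q      # magnitude of the image charge q'
    a_img = r0 ** 2 / a       # position a' of the image on the Z-axis
    # distances from the field point to the real charge and to the image
    s = math.sqrt(r ** 2 + a ** 2 - 2 * a * r * math.cos(theta))
    s_img = math.sqrt(r ** 2 + a_img ** 2 - 2 * a_img * r * math.cos(theta))
    return q / s - q_img / s_img

# largest |V| found on the sphere r = r0 over a fan of polar angles
max_on_sphere = max(abs(potential(1.0, math.pi * i / 50)) for i in range(51))
# value at a point outside the sphere, where V need not vanish
outside = potential(1.5, 0.3)
```

On the sphere the two terms cancel to within rounding error, while at the
outside point the potential remains finite.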


Problem. Find the potential of an electric dipole, and of an axial electric quadrupole.
A dipole {quadrupole} is defined as a distribution of charge whose potential, while
vanishing at infinity, is proportional to $P_1(\cos\theta)$ {$P_2(\cos\theta)$}.

Ans. $c_1 \dfrac{\cos\theta}{r^2}$, $c_2 \dfrac{3\cos^2\theta - 1}{r^3}$.


7.8. The Wave Equation. — To give a concise definition of a wave in
physical descriptive terms is not an easy matter; mathematically it is
defined as the condition of a physical quantity, $U$, which satisfies the
differential equation

    $\nabla^2 U - \dfrac{1}{v^2}\dfrac{\partial^2 U}{\partial t^2} = 0$    (7-27)

For a reason which will soon be evident, $v$ is called the phase velocity of the
wave. In general, $v$ may be a function of space coordinates (wave traveling
in a non-homogeneous medium). When this is true, eq. (27) has an
enormous variety of solutions, some of which would hardly conform to the
more intuitive conception commonly attached to the word wave. This
general case is of special interest in quantum or wave mechanics and in
certain branches of optics and will be dealt with in Chapter 11.

In the present section, $v$ will be considered constant, that is, independent
of space and time. Before examining eq. (27) by the method of separation
of variables, we discuss a form of solution which is interesting from a
physical point of view. For it happens that this equation can be solved
by the introduction of a single independent variable

    $\xi = \alpha x + \beta y + \gamma z + vt$

$\alpha$, $\beta$, $\gamma$ being constants, provided $\nabla^2$ is written in its Cartesian form. On
substituting this, (27) takes the form

    $(\alpha^2 + \beta^2 + \gamma^2)\dfrac{d^2U}{d\xi^2} - \dfrac{d^2U}{d\xi^2} = 0$

which is clearly satisfied if we put

    $\alpha^2 + \beta^2 + \gamma^2 = 1$    (7-28)

Subject to this condition, the substitution

    $\eta = \alpha x + \beta y + \gamma z - vt$

will also lead to a solution $U(\eta)$. The functional form of $U$ is left entirely
arbitrary aside from the requirement that it must permit of two differentiations.
We conclude, therefore, that

    $U = f_1(\xi) + f_2(\eta)$    (7-29)

is a general solution of the wave equation (with constant $v$).

Relation (28), however, allows the interpretation of $\alpha$, $\beta$, and $\gamma$ as direction
cosines,³ that is, as components of a unit vector, $\boldsymbol{\sigma}$. Eq. (29) then
takes the form

    $U = f_1(\boldsymbol{\sigma}\cdot\mathbf{r} + vt) + f_2(\boldsymbol{\sigma}\cdot\mathbf{r} - vt)$    (7-30)

Now constant values of $f_1(\boldsymbol{\sigma}\cdot\mathbf{r} + vt)$ are defined by $\boldsymbol{\sigma}\cdot\mathbf{r} = -vt$; they lie
on a plane traveling along $-\boldsymbol{\sigma}$ with a velocity $v$. Constant values of
$f_2(\boldsymbol{\sigma}\cdot\mathbf{r} - vt)$ are given by $\boldsymbol{\sigma}\cdot\mathbf{r} = vt$; they lie on a plane traveling along
$+\boldsymbol{\sigma}$ with velocity $v$. The representation (30) therefore describes two
plane waves traveling in opposite directions with the same speed.

A solution of equal simplicity may be obtained when (27) is written in
polar coordinates provided we assume that $U$ is a function of the radius
vector $r$ and $t$ alone. (The solution here derived is therefore far from
general.) In that case, $\nabla^2$ reduces to $\partial^2/\partial r^2 + (2/r)(\partial/\partial r)$, and the
equation reads

    $\dfrac{v^2}{r}\dfrac{\partial^2 (rU)}{\partial r^2} = \dfrac{\partial^2 U}{\partial t^2}$

The substitution $\xi = r + vt$, $rU = P$ converts it into $v^2(d^2P/d\xi^2) -
v^2(d^2P/d\xi^2) = 0$; hence $P = f_1(r + vt)$. A similar result would have been
achieved by choosing $\eta = r - vt$ in place of $\xi$. Hence

    $P = f_1(r + vt) + f_2(r - vt)$

or

    $U = \dfrac{1}{r}\,[f_1(r + vt) + f_2(r - vt)]$    (7-31)

³ This interpretation destroys generality; $\alpha$, $\beta$, $\gamma$ need not be real if only they satisfy
(28).

This solution represents two spherical waves, one traveling in toward the
origin, the other out from the origin. The factor $1/r$, without which $U$
would not be a solution of (27) and therefore not a wave, accounts for the
attenuation of a spherical wave as it moves out from its source.
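
The role of the factor $1/r$ can be made concrete with a small numerical
experiment. The Python sketch below is illustrative only: the Gaussian
profile `f`, the value v = 2 and all function names are assumptions chosen
for the example. It estimates the quantity $(v^2/r)\,\partial^2(rW)/\partial r^2 - \partial^2 W/\partial t^2$
by central differences, once for $W = f(r - vt)/r$ and once for the same
profile without the $1/r$ factor.

```python
import math

V = 2.0  # phase velocity; an arbitrary illustrative value

def f(x):
    """Assumed wave profile; any twice-differentiable function would do."""
    return math.exp(-x * x)

def spherical(r, t):
    """Outgoing spherical wave of eq. (7-31) with f1 = 0: U = f(r - vt)/r."""
    return f(r - V * t) / r

def no_attenuation(r, t):
    """The same profile without the 1/r factor, for comparison."""
    return f(r - V * t)

def residual(W, r, t, h=1e-3):
    """Central-difference estimate of (v^2/r) d^2(rW)/dr^2 - d^2W/dt^2."""
    d2_r = ((r + h) * W(r + h, t) - 2 * r * W(r, t) + (r - h) * W(r - h, t)) / h ** 2
    d2_t = (W(r, t + h) - 2 * W(r, t) + W(r, t - h)) / h ** 2
    return V ** 2 * d2_r / r - d2_t
```

For the attenuated wave the residual is at the level of the discretization
error, whereas for the unattenuated profile it equals $2v^2 f'/r$ and is far
from zero.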

By suitable choices of $f_1$ and $f_2$ a great variety of wave complexes can be
formed, of which standing waves, defined by the condition $U(r,t) =
F(r)\cdot G(t)$ where $F$ and $G$ represent new functions, are perhaps the simplest.

Problem. Show that, if $f_1$ and $f_2$ are both sine functions, written in the customary
form $\sin\,(2\pi/\lambda)(r \pm vt)$, $U$ represents a standing wave.

We now turn to a more detailed analysis of the wave equation, based on
the method of separation of variables. On assuming that

    $U = ST$

where $S$ is a function of space coordinates and $T$ a function of $t$ only,
(27) is changed to the form

    $v^2\,\dfrac{\nabla^2 S}{S} = \dfrac{\ddot{T}}{T}$

the dots denoting time derivatives. Each side of this equation must equal
the same constant which, for convenience, we shall call $-\omega^2$. No supposition
concerning the reality of $\omega$ is here implied, although $\omega$ will turn out
to be real in the more interesting practical applications. The equation

    $\ddot{T} + \omega^2 T = 0$

has the general solution

    $T = c_1 e^{i\omega t} + c_2 e^{-i\omega t}$    (7-32)

The constant $\omega$, clearly, has the meaning of an "angular" frequency.

Now the space part of the wave function is defined by the equation

    $\nabla^2 S + \dfrac{\omega^2}{v^2}\, S = 0$

The constant $\omega/v$ will henceforth be denoted by $k$; in terms of the wave
length $\lambda$, which is related to $\omega$ and $v$ by the well known formula

    $\lambda = \dfrac{2\pi v}{\omega}$

$k = 2\pi/\lambda$. It signifies the number of waves of given $\omega$ per $2\pi$ units of
length and is called the wave number. The equation

    $\nabla^2 S + k^2 S = 0$    (7-33)

is the basis of the entire theory of vibrations and will be referred to as the
space form of the wave equation. The remainder of the present section will
be devoted to its study.

7.9. One Dimension. — Eq. (33) reduces to the simple form

    $\dfrac{d^2S}{dx^2} + k^2 S = 0$

which has the solution

    $S_k = a e^{ikx} + b e^{-ikx}$

One such solution is obtained for every value of $k$. For $k = 0$,
$S_0 = ax + b$. It should be noted that

    $S = \sum_k S_k$

is not a solution of (33), since each $S_k$ satisfies (33) only for its own
value of $k$, but that

    $U = \sum_k S_k T_k$

is a solution of (27). (We are writing $T_k$ in place of $T_\omega$ because $k$ is
fixed when $\omega$ is chosen.) Similar caution is required in all subsequent
considerations.

7.10. Two Dimensions. — a. Rectangular Coordinates. The work goes
as in sec. 7.3. In place of eq. 4 we now have

    $\dfrac{X''}{X} + \dfrac{Y''}{Y} + k^2 = 0$

Separation is achieved by putting

    $\dfrac{X''}{X} = -k_1^2, \qquad \dfrac{Y''}{Y} = -k_2^2$

and requiring that

    $k_1^2 + k_2^2 = k^2$

Hence

    $S_{k_1 k_2} = XY = c\, e^{i(k_1 x + k_2 y)}$    (7-34)

b. Polar Coordinates. In place of (9) there results

    $\rho^2\dfrac{P''}{P} + \rho\dfrac{P'}{P} + \dfrac{\Phi''}{\Phi} + \rho^2 k^2 = 0$

On equating $\Phi''/\Phi$ to $-m^2$ the radial equation becomes

    $\rho^2 P'' + \rho P' + (k^2\rho^2 - m^2)P = 0$

It is identical with Bessel's (eq. 2-57) when the independent variable is
taken to be $k\rho$. Hence

    $S = Z_m(k\rho)\, e^{im\varphi}$

or, more generally,

    $S_k = \sum_m a_m Z_m(k\rho)\, e^{im\varphi}$    (7-35)

7.11. Three Dimensions. — a. Rectangular Coordinates. Immediate
generalization of eq. (34) shows that

    $S_{k_1 k_2 k_3} = c\, e^{i(k_1 x + k_2 y + k_3 z)}$    (7-36)

provided that $k_1^2 + k_2^2 + k_3^2 = k^2$. If $k_1$, $k_2$, $k_3$ are taken to be real (an
assumption destroying the generality of the solution) they may be regarded
as the components of a vector $\mathbf{k}$, and (36) may be written

    $S(\mathbf{k}) = c(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}}$    (7-37)

When this result is combined with (32) one sees that a solution of the wave
equation (27) has the form

    $U = \sum_{\mathbf{k}} c(\mathbf{k})\, e^{i(\mathbf{k}\cdot\mathbf{r} \pm \omega t)}$

or

    $U = \int c(\mathbf{k})\, e^{i(\mathbf{k}\cdot\mathbf{r} \pm \omega t)}\, d\mathbf{k}$    (7-38)

The notation⁴ used here, which is rather common in modern physics, is to
be understood as follows: A function of a vector, such as $c(\mathbf{k})$, is simply to
be regarded as a function of the three real variables $k_1$, $k_2$, and $k_3$; $d\mathbf{k}$
is an abbreviation for the product of three differentials: $dk_1\,dk_2\,dk_3$.
Summations and integrations over $\mathbf{k}$ are therefore threefold.

Eq. 38 is a very useful form of the solution of the wave equation.
Physically, it corresponds to the construction of a general wave by
superposition of plane sinusoidal waves. It also permits initial conditions to
be included in the calculation rather easily. For suppose that we know
the form of the disturbance at $t = 0$, $U_0(x,y,z)$. The $c(\mathbf{k})$ are then given
at once by the Fourier analysis of this function, viz.:

    $U_0(x,y,z) = \int c(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d\mathbf{k}$

and (38) represents the wave at any other time.


⁴ This notation is indeed ambiguous. In vector analysis, $d\mathbf{k}$ is the element of a
vector, and hence itself a vector. Here it means the element of volume in $k$-space,
which is not a vector. But the convenience of the present notation is so great that we
shall occasionally employ it when confusion is not likely to arise.

Problem a. Show that, in general,

    $U(x,y,z,t) = (2\pi)^{-3}\int\!\cdots\!\int U_0(x',y',z')\, e^{i[\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}') \pm \omega t]}\, dx'\,dy'\,dz'\,dk_1\,dk_2\,dk_3$

with $\omega = kv$. If $U_0$ is concentrated at the origin, that is, if $U_0$ is the limit of a
function which tends to $\infty$ at the origin, but in such a way that
$\iiint U_0(x,y,z)\,dx\,dy\,dz = 1$, then

    $U(x,y,z,t) = (2\pi)^{-3}\iiint e^{i(\mathbf{k}\cdot\mathbf{r} \pm \omega t)}\, dk_1\,dk_2\,dk_3$

Problem b. Show that

(α) $U$ decreases continually with time at $r = 0$.
(β) $U$ is zero wherever $|\mathbf{r}| > vt$.

Note the physical significance of these results.


b. Cylindrical Coordinates. The substitutions in sec. 7.4b lead to the
ordinary differential equations

    $Z'' = -\kappa^2 Z$

    $\Phi'' = -l^2 \Phi$

    $\rho^2 P'' + \rho P' - [(\kappa^2 - k^2)\rho^2 + l^2]P = 0$

The last equation has the solution

    $P = Z_l(\sqrt{k^2 - \kappa^2}\,\rho)$

Consequently

    $S_{k,\kappa,l} = e^{\pm i\kappa z}\, e^{\pm il\varphi}\, Z_l(\sqrt{k^2 - \kappa^2}\,\rho)$    (7-39)

If this function is to be single-valued in $\varphi$, $l$ must be an integer.
Constructing a solution of the wave equation wherein the space function has the
form (39) we thus obtain

    $U = \sum_{k,\kappa,l} c_{k,\kappa,l}\, e^{i(\pm\kappa z \pm l\varphi \pm \omega t)}\, Z_l(\sqrt{k^2 - \kappa^2}\,\rho)$

But it is usually more satisfactory to indicate the nature of the summations
($l$ is integral, $\kappa$ and $k$ may vary continuously) more explicitly. If,
furthermore, we limit $\sqrt{k^2 - \kappa^2}$ to real, positive values (thus again destroying
generality) and call this quantity $\mu$, the following useful representation is
obtained:

    $U = \int dk \sum_{l=-\infty}^{\infty}\int_0^{\infty} g_l(\mu)\, J_l(\mu\rho)\, e^{i(l\varphi \pm \kappa z \pm \omega t)}\, \mu\, d\mu$    (7-40)

where we have written $c = g\mu$ for convenience later, and $J_l$ is the Bessel
function defined in sec. 2.14.

The type of problem in which eq. (40) is used is this. Suppose that at
$t = 0$, the disturbance is confined to the plane $z = 0$ where it has the form
$U_0(\rho,\varphi)$. Also, let the wave be monochromatic ($k$ = const., so that
integration over $dk$ is absent). Then

    $U_0(\rho,\varphi) = \sum_{l=-\infty}^{\infty}\int_0^{\infty} g_l(\mu)\, J_l(\mu\rho)\, e^{il\varphi}\, \mu\, d\mu$    (7-41)

and from this relation all coefficients $g_l(\mu)$ can be determined. For if we
multiply both sides of (41) by $e^{-il'\varphi}$ and integrate over $\varphi$ from 0 to $2\pi$, we
obtain

    $\dfrac{1}{2\pi}\int_0^{2\pi} U_0(\rho,\varphi)\, e^{-il'\varphi}\, d\varphi \equiv U_{l'}(\rho) = \int_0^{\infty} g_{l'}(\mu)\, J_{l'}(\mu\rho)\, \mu\, d\mu$

This, however, is nothing other than a Fourier-Bessel transformation⁵ of
$U_{l'}(\rho)$, and it follows that

    $g_l(\mu) = \int_0^{\infty} U_l(\rho)\, J_l(\mu\rho)\, \rho\, d\rho$

Problem. Show that the diffraction pattern due to a plane monochromatic wave
passing through a circular aperture of radius $a$ is given by

    $U(\rho, z) = \text{const.}\int_0^{\infty} J_0(\mu\rho)\, J_1(\mu a)\, e^{i\sqrt{k^2 - \mu^2}\, z}\, d\mu$

⁵ For further details, see sec. 8.3.

c. Spherical (Polar) Coordinates. The equation for $S$ is similar to (15),
except that the term $+k^2 S$ is also present on the left. The substitution

    $S = R(r)\cdot\Theta(\theta)\cdot\Phi(\varphi)$

now leads to the three equations

    $\Phi'' = -m^2\Phi$    (7-42a)

    $\dfrac{1}{\sin\theta}\dfrac{d}{d\theta}(\sin\theta\,\Theta') - \dfrac{m^2}{\sin^2\theta}\,\Theta + l(l+1)\,\Theta = 0$    (7-42b)

    $\dfrac{1}{r^2}\dfrac{d}{dr}(r^2 R') + \left[k^2 - \dfrac{l(l+1)}{r^2}\right]R = 0$    (7-42c)

The second of these is the equation for associated Legendre functions ($l$
and $m$ are integers again: $m$ to insure single-valuedness in $\varphi$, $l$ in order that
the solution $\Theta$ be a polynomial, i.e., that it should not diverge for
$\cos\theta = \pm 1$). The third equation may be transformed as follows. Put
$R = P/r$ and change the independent variable to $t = kr$. Eq. 42c then
takes the form

    $\dfrac{d^2P}{dt^2} + \left[1 - \dfrac{l(l+1)}{t^2}\right]P = 0$

Again, put $P = t^{1/2} Q$, so that the last equation reads

    $\dfrac{d^2Q}{dt^2} + \dfrac{1}{t}\dfrac{dQ}{dt} + \left[1 - \dfrac{(l + \frac{1}{2})^2}{t^2}\right]Q = 0$

This is at once recognized as Bessel's equation (2-57); hence

    $Q = Z_{l+1/2}(t)$

so that

    $R = c\, r^{-1/2} Z_{l+1/2}(kr)$

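For the space part of the wave function, we thus find

    $S_k = \sum_{m,l} c_{k,l,m}\, P_l^m(\cos\theta)\, e^{im\varphi}\, r^{-1/2} Z_{l+1/2}(kr)$    (7-43)

A sum of the form

    $\sum_{m=-l}^{l} c_m P_l^m(\cos\theta)\, e^{im\varphi}$

with arbitrary coefficients $c_m$ is often called a spherical harmonic and denoted
by the symbol $Y_l(\theta,\varphi)$. In using this symbol one must remember that the
function which it represents is not unique, but contains $2l + 1$ arbitrary
constants. With this abbreviation, then,

    $S_k = r^{-1/2}\sum_l Z_{l+1/2}(kr)\, Y_l(\theta,\varphi)$

and the wave function is

    $U = \int dk \sum_{l=0}^{\infty} r^{-1/2} Z_{l+1/2}(kr)\, Y_l(\theta,\varphi)\, e^{\pm i\omega t}$    (7-44)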
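
The radial equation (7-42c) can be checked numerically for the two lowest
orders. The Python sketch below is an illustration under stated assumptions:
it uses the standard elementary forms of the half-order Bessel functions, so
that $r^{-1/2}J_{1/2}(kr)$ is proportional to $\sin kr/kr$ and
$r^{-1/2}J_{3/2}(kr)$ to $\sin kr/(kr)^2 - \cos kr/kr$; the value k = 1.7 and
the function names are arbitrary choices made for the example.

```python
import math

def radial_residual(R, l, k, r, h=1e-3):
    """Central-difference value of (1/r^2) d/dr(r^2 R') + [k^2 - l(l+1)/r^2] R."""
    d1 = (R(r + h) - R(r - h)) / (2 * h)
    d2 = (R(r + h) - 2 * R(r) + R(r - h)) / h ** 2
    # (1/r^2) d/dr (r^2 R') expands to R'' + (2/r) R'
    return d2 + 2 * d1 / r + (k ** 2 - l * (l + 1) / r ** 2) * R(r)

k = 1.7  # illustrative wave number

def R0(r):
    """l = 0 radial function, proportional to r^(-1/2) J_{1/2}(kr)."""
    return math.sin(k * r) / (k * r)

def R1(r):
    """l = 1 radial function, proportional to r^(-1/2) J_{3/2}(kr)."""
    return math.sin(k * r) / (k * r) ** 2 - math.cos(k * r) / (k * r)
```

Each function satisfies (7-42c) only with its own value of $l$; inserting
`R0` with $l = 1$ leaves a residual $-2R_0/r^2$ that does not vanish.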


7.12. Examples of Solutions of the Wave Equation. — The local pressure
$P$ in a gas traversed by a sound wave satisfies the wave equation.

a. The simplest type of a wave is that emitted by a "breathing"
sphere, i.e., a sphere performing volume oscillations without distortion.
It is characterized by the two boundary conditions:

    (α) $P_{r=r_0}$ = const.

    (β) $P_{r\to\infty} = f(r,\theta)\, e^{i(kr - \omega t)}$

Condition (α) states that at the surface of the sphere (of radius $r_0$) all
points shall be in phase; condition (β) implies that at infinity the wave
shall be an outgoing one. We limit ourselves to monochromatic waves
(pure tones), so that there is only one value of $k$ or $\omega$. Clearly, spherical
polar coordinates must here be used. Considering then eq. (44), we must
first omit the integration over $k$. Since in accordance with condition (α)
there must, at $r = r_0$, be no functional dependence either on $\varphi$ or on $\theta$,
both $l$ and $m$ are zero. Hence (44) reduces to

    $P = C\, r^{-1/2} Z_{1/2}(kr)\, e^{-i\omega t}$

But the general Bessel function 

as was shown in Chapter 3. Inserting these, we have 

\ kr hr / 

In order to satisfy condition (β) we put $a_1 = i$, $a_2 = -1$, obtaining



as our final result. 

b. When the sphere of the preceding example vibrates, not with spherical
symmetry, but in such a way that condition (α) reads

    (α) $P_{r=r_0}$ = const. $\cos\theta$

it is said to emit dipole waves. Condition (β) remains unchanged. Of all
the functions composing $Y_l(\theta,\varphi)$, only $P_1^0(\cos\theta)$ is a cosine function.
Therefore $l$ must be 1. Hence (44) now reduces to

    $P = C\, r^{-1/2} Z_{3/2}(kr)\cos\theta\, e^{-i\omega t}$

But

    $r^{-1/2} Z_{3/2}(kr) = r^{-1/2}\,[a_1 J_{3/2}(kr) + a_2 J_{-3/2}(kr)]$

and this is proportional to

    $a_1\left[\dfrac{\sin kr}{(kr)^2} - \dfrac{\cos kr}{kr}\right] - a_2\left[\dfrac{\sin kr}{kr} + \dfrac{\cos kr}{(kr)^2}\right]$

If this expression is to satisfy condition (β), it is necessary to choose



The constant $C$ may be complex. If it is written $C = C_1 + iC_2$, the real
part of $P$, which alone is of interest, will be

    $\Re P = \cos\theta\left[\left(\dfrac{C_2}{(kr)^2} - \dfrac{C_1}{kr}\right)\cos(kr - \omega t) + \left(\dfrac{C_1}{(kr)^2} + \dfrac{C_2}{kr}\right)\sin(kr - \omega t)\right]$

For small values of $r$,

    $\Re P = \dfrac{\cos\theta}{(kr)^2}\,[C_2\cos(kr - \omega t) + C_1\sin(kr - \omega t)]$,

for large $r$,

    $\Re P = -\dfrac{\cos\theta}{kr}\,[C_1\cos(kr - \omega t) - C_2\sin(kr - \omega t)]$

If $C_1$ is zero, the disturbance is of the form $\cos(kr - \omega t)$ near the
surface of the sphere, but of the sine form at infinity. If $C_2 = 0$, the
reverse is true. There occurs, therefore, a curious change of phase as the
wave moves outward.

7.13. Equation of Heat Conduction and Diffusion. — The temperature
$U$ in a homogeneous medium, in which $A(x,y,z)$ calories of heat are generated
(by some unspecified agency) per unit of volume surrounding the
point $(x,y,z)$ per second, and which has density $\rho$, specific heat $s$, and
thermal conductivity $\kappa$, satisfies the partial differential equation

    $\dfrac{\partial U}{\partial t} = \dfrac{\kappa}{\rho s}\,\nabla^2 U + \dfrac{A}{\rho s}$    (7-45)

Various simplifying conditions may arise: In the first place, attention may
be confined to "steady states," that is, to temperature distributions which
do not change with time. Such states will always occur in physical and
chemical problems after heat conduction has taken place for a sufficiently
long time. In that case, $\partial U/\partial t$ is zero, and the equation reads

    $\nabla^2 U = -\dfrac{A}{\kappa}$    (7-46)

It is of the form of Poisson's equation which will be discussed in sec. 7.17.
If, in addition, it is assumed that no heat is generated anywhere within the
body, $A = 0$ and (46) becomes identical with Laplace's equation which
we have already studied.

Of greater interest is the situation in which, to be sure, $A$ is taken to be
zero, but consideration is given to non-steady states. The temperature is
then subject to the equation

    $\dfrac{\kappa}{\rho s}\,\nabla^2 U - \dfrac{\partial U}{\partial t} = 0$    (7-47)

which is very similar to the wave equation. (This equation is derived by
vector methods in sec. 4.22.)

In the kinetic theory, one meets the equation of diffusion which regulates
the flow of fluid matter within another material medium. It states in its
basic form:

    $\dfrac{\partial U}{\partial t} = \nabla\cdot(D\,\nabla U)$    (7-48)

$U$ represents the concentration of fluid matter, $D$ its coefficient of diffusion.
Strictly speaking, $D$ is a function of $U$ and hence of $(x,y,z)$. But for small
concentrations $D$ is found to be very nearly constant. For that case, then,
(48) may be written

    $D\,\nabla^2 U - \dfrac{\partial U}{\partial t} = 0$    (7-49)

All parameters appearing in (49) as well as in (47) are positive, hence
both of these equations will be written in the form

    $a^2\nabla^2 U - \dfrac{\partial U}{\partial t} = 0$    (7-50)

and we remember that, for heat conduction, $U$ = temperature and
$a^2 = \kappa/\rho s$, while for diffusion, $U$ = concentration and $a^2 = D$. The
remainder of this section is devoted to the solutions of eq. (50).

Separation may at once be achieved by putting $U = S(x,y,z)\cdot T(t)$,
and it is found that $a^2\nabla^2 S/S = \dot{T}/T$. On equating the right-hand side to
$-k^2a^2$, $k$ being an arbitrary constant, it is seen that

    $T_k = \text{const.}\; e^{-k^2a^2t}$    (7-51)

while $S$ must satisfy

    $\nabla^2 S + k^2 S = 0$    (7-52)

an equation identical with the space form of the wave equation, (33). If,
therefore, we combine the solutions of (33), discussed in the preceding
section, with $T_k$ in the form (51), we have an answer to the problems of
heat conduction and diffusion.

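7.14. Example; Linear Flow of Heat. — Suppose that heat flows in a
linear filament placed along the X-axis. The solution of (52) is then

    $S_k = a e^{ikx} + b e^{-ikx}$

and this may be taken as

    $S_k = c(k)\, e^{ikx}$

if we assign both positive and negative values to $k$. The general solution
reads:

    $U = \sum_k c(k)\, e^{ikx - a^2k^2t} = \int_{-\infty}^{\infty} c(k)\, e^{ikx - a^2k^2t}\, dk$    (7-53)

Every choice for $c(k)$ will satisfy eq. (47), but the proper selection is to
be made in accordance with initial conditions. Let us suppose, then, that
$U = U_0(x)$ at $t = 0$. Eq. (53) now states:

    $U_0(x) = \int_{-\infty}^{\infty} c(k)\, e^{ikx}\, dk$

and $c(k)$ may be obtained from this by means of a Fourier transformation.
In view of eq. (8-13)

    $c(k) = \dfrac{1}{2\pi}\int_{-\infty}^{\infty} U_0(x')\, e^{-ikx'}\, dx'$

so that (53) becomes

    $U = \dfrac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} U_0(x')\, e^{ik(x - x') - a^2k^2t}\, dx'\, dk$

The integration with respect to $k$ can be performed:

    $\int_{-\infty}^{\infty} e^{ik(x - x') - a^2k^2t}\, dk = \dfrac{\sqrt{\pi}}{a\sqrt{t}}\; e^{-(x - x')^2/4a^2t}$

whence

    $U(x,t) = \dfrac{1}{2a\sqrt{\pi t}}\int_{-\infty}^{\infty} U_0(x')\, e^{-(x - x')^2/4a^2t}\, dx'$    (7-54)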
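
Formula (7-54) is easily evaluated by numerical quadrature. The Python
sketch below is an illustration only: the initial profile (unit temperature
for $|x'| < 1$, zero elsewhere), the midpoint rule, the integration limits
and all function names are assumptions made for the example. For this
profile the integral can also be done in closed form with the error
function, which gives a value to compare against.

```python
import math

def heat_1d(x, t, u0, a=1.0, lo=-10.0, hi=10.0, n=4000):
    """Evaluate eq. (7-54) by the midpoint rule on the interval [lo, hi]."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        xp = lo + (i + 0.5) * dx
        total += u0(xp) * math.exp(-(x - xp) ** 2 / (4 * a * a * t)) * dx
    return total / (2 * a * math.sqrt(math.pi * t))

def box(xp):
    """Assumed initial profile: 1 for |x'| < 1, 0 elsewhere."""
    return 1.0 if abs(xp) < 1.0 else 0.0

def box_exact(x, t, a=1.0):
    """Closed form of (7-54) for the box profile, via the error function."""
    s = 2 * a * math.sqrt(t)
    return 0.5 * (math.erf((1 - x) / s) + math.erf((1 + x) / s))
```

Evaluating `heat_1d(0.0, 1e-4, box)` gives a value very close to 1, in
keeping with the requirement that (7-54) reproduce $U_0(x)$ as $t \to 0$.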


Problems.

a. Prove that (54) reduces to $U_0(x)$ for $t = 0$.

b. Show that, if $U_0(x)$ is a step function such that


''•-{Ji 

U 

2 

v; 


a: < 1 

X >1 




/' 


-Pdi. 


where $\Phi(x)$ is the error integral

c. Show that, if 

-K ::: 

Interpret the last two problems from the point of view of diffusion. 

d. Suppose $U_0$ is a "function" which is everywhere zero except at $x = 0$, where it
tends to $\infty$ in such a way that $\int U_0(x)\,dx = 1$. (Such a "function" was introduced
by Dirac and is commonly known to physicists as a δ-function. Strictly speaking it is
no function at all.) Then, clearly,

    $U = \dfrac{1}{2a\sqrt{\pi t}}\; e^{-x^2/4a^2t}$

Discuss the temperature at any point $x$, and show in particular that it will rise to a
maximum at $t = x^2/2a^2$. This fact affords a simple experimental determination of $a$,
and hence of $D$ and the thermal quantities.
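
The time of maximum temperature can be located by brute force. The Python
sketch below is only an illustration: the scan range, the step count and
the function names are assumptions made for the example. It tabulates the
point-source solution $U = (2a\sqrt{\pi t})^{-1} e^{-x^2/4a^2t}$ at a fixed $x$
and returns the time at which the largest value occurs.

```python
import math

def point_source(x, t, a=1.0):
    """Temperature at x and time t due to a unit point source at the origin."""
    return math.exp(-x * x / (4 * a * a * t)) / (2 * a * math.sqrt(math.pi * t))

def time_of_maximum(x, a=1.0, t_max=10.0, n=20000):
    """Scan t in (0, t_max] and return the time at which U(x, t) is largest."""
    best_t, best_u = None, -1.0
    for i in range(1, n + 1):
        t = t_max * i / n
        u = point_source(x, t, a)
        if u > best_u:
            best_t, best_u = t, u
    return best_t
```

With x = 1.5 and a = 0.8 the scan returns a time close to
$x^2/2a^2 \approx 1.758$, so that a measured arrival time of the maximum
fixes $a$.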


7.15. Two-Dimensional Flow of Heat. — In polar coordinates, $S$ is given
by eq. (35). Hence

    $U = \int_{-\infty}^{\infty} c(k)\,dk\left(\sum_m a_m Z_m(k\rho)\, e^{im\varphi}\right) e^{-a^2k^2t}$    (7-55)

If, as we shall suppose, the temperature distribution at $t = 0$ is radially
symmetrical, so that $U$ does not depend on $\varphi$, the only value permitted to $m$
is zero. Also, since $Z_0$ is an even function, the integration in (55) may be
taken from 0 to $\infty$ without error. For $Z_0$ we shall take the $J_0$-function,
because it will at once be seen that most temperature distributions can be
expressed in terms of $J_0$ alone. Thus

    $U = \int_0^{\infty} c(k)\, J_0(k\rho)\, e^{-a^2k^2t}\, dk$    (7-56)

Let us write

    $c(k) = k\,g(k)$

and suppose that $U = U_0(\rho)$ at $t = 0$. It is then easy to determine $g(k)$
formally and hence $U(\rho,t)$. For in accordance with (56)

    $U_0(\rho) = \int_0^{\infty} g(k)\, J_0(k\rho)\, k\, dk$

in other words, $g(k)$ is the Fourier-Bessel transform of $U_0(\rho)$. (Cf. Sec. 8.3.)
Hence

    $g(k) = \int_0^{\infty} U_0(\rho')\, J_0(k\rho')\, \rho'\, d\rho'$

When this is put back into (56) the final form of $U(\rho,t)$ is obtained:

    $U(\rho,t) = \int_0^{\infty}\int_0^{\infty} U_0(\rho')\, J_0(k\rho')\, J_0(k\rho)\, e^{-a^2k^2t}\, k\rho'\, dk\, d\rho'$    (7-57)

    $\int_0^{\infty} U_0(\rho)\,\rho\, d\rho = 1$,

Compare this with problem (d) of Sec. 7.14. Interpret above as a diffusion problem.

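7.16. Heat Flow in Three Dimensions. — In rectangular coordinates, $U$
is given as a generalization of eq. (53) (cf. eq. (36) for the form of $S$):

    $U = \iiint c(k_1,k_2,k_3)\, e^{i(k_1x + k_2y + k_3z) - a^2(k_1^2 + k_2^2 + k_3^2)t}\, dk_1\,dk_2\,dk_3$

or, with the use of the vector notation previously explained (cf. footnote
on p. 227)

    $U = \iiint c(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r} - a^2k^2t}\, d\mathbf{k}$    (7-58)

We now repeat essentially the procedure leading from (53) to (54), but
using three variables instead of one. At $t = 0$,

    $U_0(x,y,z) = \iiint c(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d\mathbf{k}$

hence $c(\mathbf{k})$ is the Fourier transform of $U_0$:

    $c(\mathbf{k}) = \dfrac{1}{(2\pi)^3}\iiint U_0(x',y',z')\, e^{-i\mathbf{k}\cdot\mathbf{r}'}\, d\tau'$

whence

    $U(x,y,z,t) = \dfrac{1}{(2\pi)^3}\int\!\cdots\!\int U_0(x',y',z')\, e^{i\mathbf{k}\cdot(\mathbf{r} - \mathbf{r}') - a^2k^2t}\, d\tau'\, d\mathbf{k}$

    $\qquad = (2a\sqrt{\pi t})^{-3}\iiint U_0(x',y',z')\, e^{-(\mathbf{r} - \mathbf{r}')^2/4a^2t}\, d\tau'$    (7-59)

If $U_0$ is a function of $x'$ alone, the integration over $y'$ and $z'$ may be
performed, and the result is identical with (54), as it should be. Of greatest
practical importance is the case where $U_0$ is a function of $r'$ alone. The
volume element $d\tau'$ may then be written in polar form: $r'^2 dr' \sin\theta'\, d\theta'\, d\varphi'$,
and the integration over $\theta'$ and $\varphi'$ can be performed. It is to be observed
in this connection that

    $(\mathbf{r} - \mathbf{r}')^2 = r^2 + r'^2 - 2rr'\cos\theta'$

One then finds

    $U(r,t) = (2ar\sqrt{\pi t})^{-1}\int_0^{\infty} U_0(r')\left[e^{-(r - r')^2/4a^2t} - e^{-(r + r')^2/4a^2t}\right] r'\, dr'$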
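
The radial formula just obtained can be tried out numerically. The Python
sketch below is illustrative only: it assumes the initial distribution
$U_0 = 1$ for $r' \le 1$ and zero outside, uses a midpoint rule, and the
function names are arbitrary. For comparison it also evaluates the closed
form that results from carrying out the integral for this profile, written
with the error function and the abbreviations $\xi_\pm = (1 \pm r)/2a\sqrt{t}$.

```python
import math

def heat_radial(r, t, a=1.0, n=2000):
    """Midpoint-rule value of the radial integral for U0 = 1 (r' <= 1), else 0."""
    dr = 1.0 / n
    s2 = 4 * a * a * t
    total = 0.0
    for i in range(n):
        rp = (i + 0.5) * dr
        total += (math.exp(-(r - rp) ** 2 / s2)
                  - math.exp(-(r + rp) ** 2 / s2)) * rp * dr
    return total / (2 * a * r * math.sqrt(math.pi * t))

def heat_radial_closed(r, t, a=1.0):
    """Closed form for the same profile in terms of the error function."""
    s = 2 * a * math.sqrt(t)
    xi_plus, xi_minus = (1 + r) / s, (1 - r) / s
    return (0.5 * (math.erf(xi_plus) + math.erf(xi_minus))
            + (a / r) * math.sqrt(t / math.pi)
            * (math.exp(-xi_plus ** 2) - math.exp(-xi_minus ** 2)))
```

The two evaluations agree to within the quadrature error, and the
temperature stays between 0 and 1, as it must for this initial state.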


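Problem. Show that, if

    $U_0 = 1$ for $r \le 1$, $\quad U_0 = 0$ for $r > 1$,

    $U = \tfrac{1}{2}\,[\Phi(\xi_+) + \Phi(\xi_-)] + \dfrac{a}{r}\sqrt{\dfrac{t}{\pi}}\left(e^{-\xi_+^2} - e^{-\xi_-^2}\right)$

where

    $\xi_\pm = \dfrac{1 \pm r}{2a\sqrt{t}}$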

7.17. Poisson's Equation. — All partial differential equations treated
thus far in the present chapter are linear and homogeneous in the dependent
variables (cf. footnote on p. 45). It is only for this type of equation
that the method of separation of variables may work. The variety of
linear and inhomogeneous equations of importance in scientific analysis
is also great, but there exists for their solution no method nearly so
powerful as the separation of variables.

An equation like (33), the space form of the wave equation, would
become inhomogeneous if the right-hand side were not zero but some
function $f(x,y,z)$. One remarkable feature of an inhomogeneous equation,
which will here only be mentioned, is that it may not possess solutions for
every value of $k$ even though the homogeneous equation, with the same
boundary condition, has solutions. The inhomogeneity selects, as it were,
special values of the parameter $k$ for which solutions are possible. This
phenomenon, which is the rule for inhomogeneous equations, may also
occur for homogeneous ones if the boundary or initial conditions of the
problem are sufficiently stringent. It will be discussed under the heading
"characteristic values" or "eigenvalues" in Chapter 8.

An inhomogeneous equation which is rather common is Poisson's;
it will here be chosen to illustrate a process of solution. Its general form is:

    $\nabla^2\Phi = f(x,y,z)$    (7-60)

One encounters it (1) in electrostatics, where $\Phi$ is the ordinary potential
and $f$ represents a constant times the distribution of charge,⁶ $\rho(x,y,z)$, the
constant depending on the units chosen; (2) in the theory of heat flow,
where eq. (45) takes the form (60) when $\partial U/\partial t = 0$, as shown in eq. (46).

To solve (60) we first recall Green's theorem (see sec. 4.19) which
states that, for any two functions of space coordinates, $u$ and $v$, which are
finite, continuous and have continuous first and second derivatives,

    $\int_\tau (u\nabla^2 v - v\nabla^2 u)\, d\tau = \int_\sigma (u\nabla v - v\nabla u)\cdot d\boldsymbol{\sigma}$    (7-61)

Here $\tau$ represents a certain closed volume and $\sigma$ its surface; $d\boldsymbol{\sigma}$ is taken
positive in the direction outward from the volume. In our problem we are
given the function $f(x,y,z)$ and we wish to find $\Phi(x',y',z')$ for a fixed point of
observation $(x',y',z')$. In the following it is necessary to distinguish
between this fixed point, which will be denoted by primes, and the variable
point $(x,y,z)$ over which integrations are to be performed.

It will prove convenient to consider, in connection with theorem (61), a
volume $\tau$ such as that depicted in Fig. 2. It is bounded by the outer
surface $\sigma_2$ and the inner surface $\sigma_1$, a spherical cavity of radius $s_0$ about the
fixed point $P'$. The function $u$ will be specified to be

    $u = \dfrac{1}{s} = \left[(x - x')^2 + (y - y')^2 + (z - z')^2\right]^{-1/2}$;

it satisfies Laplace's equation $\nabla^2 u = 0$, as may readily be verified. Then
eq. (61) reads:

    $\int_\tau \dfrac{1}{s}\,\nabla^2 v\, d\tau = \int_\sigma \left[\dfrac{1}{s}\nabla v - v\,\nabla\!\left(\dfrac{1}{s}\right)\right]\cdot d\boldsymbol{\sigma}$    (7-62)

⁶ If $\rho = 0$ the equation reduces to Laplace's, as pointed out in sec. 7.2.

If now we interpret $v$ as $\Phi$ we may replace $\nabla^2 v$ by $f$ in accordance with (60).
The right-hand side of (62) consists of two integrations, one over $d\sigma_1$ and the
other over $d\sigma_2$.

Fig. 7-2

Consider first that over $d\sigma_1$. Clearly, $\nabla\Phi\cdot d\boldsymbol{\sigma}_1$ approaches
$-(\partial\Phi/\partial s)_{P'}\, d\sigma_1$ as $s_0$ tends to zero, the minus sign coming from the fact
that $d\boldsymbol{\sigma}_1$ is inward with respect to the cavity. Hence

    $\int_{\sigma_1} \dfrac{1}{s}\,\nabla\Phi\cdot d\boldsymbol{\sigma}_1 \to -\left(\dfrac{\partial\Phi}{\partial s}\right)_{P'} \dfrac{1}{s_0}\, 4\pi s_0^2 \to 0$

provided $\Phi$ has a finite derivative. The second integral on the right of
(62), when taken over $\sigma_1$ becomes, in the limit as $s_0 \to 0$,

    $-\int_{\sigma_1} \Phi\,\nabla\!\left(\dfrac{1}{s}\right)\cdot d\boldsymbol{\sigma}_1 \to -\Phi(P')\,\dfrac{1}{s_0^2}\, 4\pi s_0^2 = -4\pi\,\Phi(x',y',z')$

Hence, if it is assumed that $s_0 \to 0$, eq. (62) reduces to

    $\int_\tau \dfrac{f}{s}\, d\tau = -4\pi\,\Phi(x',y',z') + \int_{\sigma_2} \left[\dfrac{1}{s}\nabla\Phi - \Phi\,\nabla\!\left(\dfrac{1}{s}\right)\right]\cdot d\boldsymbol{\sigma}$    (7-63)

Here the remaining integral over $\sigma_2$ on the right has a rather simple meaning.
It is a solution of Laplace's equation in the form: $\nabla'^2\Phi = 0$,
for the only quantities which depend on the primed coordinates are $1/s$
and $\nabla(1/s)$, and these clearly satisfy it. Hence if this whole integral were
subtracted from $\Phi$, the remainder would still satisfy eq. (60). It is indeed
easily seen that

    $\dfrac{1}{4\pi}\int_{\sigma_2} \left[\dfrac{1}{s}\nabla\Phi - \Phi\,\nabla\!\left(\dfrac{1}{s}\right)\right]\cdot d\boldsymbol{\sigma}$

represents the contribution to $\Phi$ coming from those parts of $f$ which
lie outside of $\tau$. In the electrostatic case, it represents the potential due
to the charge outside of the volume $\tau$ considered.

The integral over $\sigma_2$ may be eliminated in another way. Suppose we
allow $\tau$ to become infinite and impose on $\Phi$ the boundary condition that, at
infinity, it vanish at least as strongly as $1/r$. Then $\nabla\Phi$ and $\nabla(1/s)$ are
both of order $1/r^2$ at $\infty$, and after the surface integration, which amounts
to multiplication by $r^2$, the result will still be of the order $1/r$ and hence
vanish.

Of interest, therefore, is chiefly the particular solution which remains
when the integral over $\sigma_2$ in (63) is omitted; it is usually referred to as the
solution of Poisson's equation. Thus

    $\Phi(x',y',z') = -\dfrac{1}{4\pi}\iiint \dfrac{f(x,y,z)}{|\mathbf{r} - \mathbf{r}'|}\, dx\,dy\,dz$    (7-64)
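
The behavior of (7-64) far from the source can be examined with a crude
numerical sum. The Python sketch below is an illustration under stated
assumptions: the source is taken to be $f = 1$ inside the unit ball and zero
outside, the integral is replaced by a sum over a cubic grid of cell centers,
and the function name and grid size are arbitrary choices made for the
example.

```python
import math

def poisson_potential(xp, yp, zp, n=20):
    """Grid-sum estimate of eq. (7-64) for f = 1 inside the unit ball.

    Returns the potential at (xp, yp, zp) and the discretized total Q.
    """
    h = 2.0 / n
    total, q = 0.0, 0.0
    for i in range(n):
        x = -1 + (i + 0.5) * h
        for j in range(n):
            y = -1 + (j + 0.5) * h
            for m in range(n):
                z = -1 + (m + 0.5) * h
                if x * x + y * y + z * z <= 1.0:
                    dist = math.sqrt((x - xp) ** 2 + (y - yp) ** 2 + (z - zp) ** 2)
                    total += h ** 3 / dist   # contribution f dV / |r - r'|
                    q += h ** 3              # contribution to Q = integral of f
    return -total / (4 * math.pi), q

phi, q = poisson_potential(10.0, 0.0, 0.0)
far_field = -q / (4 * math.pi * 10.0)  # point-source value -Q/(4 pi r')
```

At ten radii from the origin the grid sum and the point-source expression
$-Q/4\pi r'$ agree closely, which is the electrostatic statement that a
distant observer sees only the total charge.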


Problem. Show that, when $f(x,y,z)$ is different from zero only within a finite volume
$\tau_0$ such that $\int_{\tau_0} f\, d\tau = Q$, then for any point $(x',y',z')$ far removed from $\tau_0$,

    $\Phi = -\dfrac{Q}{4\pi r'}$

the origin being chosen inside $\tau_0$. Interpret this result in electrostatics.



CHAPTER 8 


EIGENVALUES AND EIGENFUNCTIONS 

8.1. Simple Examples of Eigenvalue Problems. — It frequently happens
in mathematical analysis that a given equation, or a set of equations,
yields solutions which are in general uninteresting or trivial, except when
a certain parameter appearing in the equations is given a definite value.
Such circumstances give rise to eigenvalues¹ and eigenfunctions.¹ Their
occurrence is so common that it often goes unrecognized. For illustration,
let us take a very simple (and useless) example.

Suppose one wishes to solve the two simultaneous equations

    $(1 - \lambda)x + 2y = 0, \qquad 2x + (1 - \lambda)y = 0$

To be sure, they always possess solutions; but they are almost always
$x = 0$, $y = 0$. Only for two values of the parameter $\lambda$ will this not be true:
for $\lambda = 3$ the solution is $x = y$; for $\lambda = -1$ it is $x = -y$. (The numerical
values of $x$ or $y$ are of course never fixed by the linear homogeneous
equations above.) The two values of $\lambda$ for which the equations possess
non-vanishing solutions are said to be eigenvalues; the two corresponding
solutions are called eigenfunctions.
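
The two special values of $\lambda$ can be found from the condition that the
determinant of the coefficients vanish, $(1 - \lambda)^2 - 4 = 0$, a step the text
leaves implicit. The short Python sketch below is an illustration only; the
function names are arbitrary. It solves the expanded quadratic
$\lambda^2 - 2\lambda - 3 = 0$ and substitutes the quoted solutions back into the
two equations.

```python
import math

def determinant(lam):
    """Determinant of the coefficient matrix [[1 - lam, 2], [2, 1 - lam]]."""
    return (1 - lam) ** 2 - 4

def eigenvalues():
    """Roots of lam^2 - 2 lam - 3 = 0, the expanded determinant condition."""
    root = math.sqrt(2 ** 2 + 4 * 3)
    return (2 - root) / 2, (2 + root) / 2

def residual(lam, x, y):
    """Left-hand sides of the two simultaneous equations."""
    return (1 - lam) * x + 2 * y, 2 * x + (1 - lam) * y
```

For any other value of $\lambda$ the determinant differs from zero and only
$x = y = 0$ satisfies both equations.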

Eigenvalues are not always denumerable and discrete, as in the foregoing
example. To show this, we choose an even more trivial illustration.
The equation

    $x^2 = \lambda$

always possesses a solution. If, however, we wish a real solution, $\lambda$ is at
once limited to the domain of positive numbers. Hence we may properly
say that $x^2 = \lambda$ is an equation leading to eigenvalues: $\lambda \ge 0$ and
corresponding eigenfunctions $x = \sqrt{\lambda}$.

In both examples eigenvalues were called into being by the imposition of
special conditions: in the first that the solutions shall not vanish everywhere;
in the second that the solution shall be real. This is generally true;
eigenvalues are always produced by special requirements placed upon the
solutions of equations. In the most interesting cases of physics and chemistry,
these equations are differential or integral equations, and the conditions
are boundary conditions. We now turn to some cases of greater
scientific interest.

¹ The terms eigenvalue and eigenfunction, because of their brevity, appear to be
rapidly displacing their classical synonyms, characteristic value and characteristic
function, at least in the physical and chemical literature.

8.2. Vibrating String; Fourier Analysis. — In classical physics, many 
eigenvalue problems occur in connection with vibrating systems. The 
simplest of these is the problem of a vibrating string. Consider the string 
to extend along the X-axis, to be fastened with its left end at the origin and 
its right end at rc = Z. From elementary physics we recall that if its mass 
per unit length is m and its tension T, the speed of waves along the string 
is given by t; = ^^TJm, The wave equation, discussed in sec. 7.8, will 
then read 


V 


2 


d^U 

dx^ 



( 8 - 1 ) 


$U$ is the vertical displacement of the points along $x$.

We restrict our attention for the moment to types of vibration having a
single frequency $\nu$, or angular frequency $\omega = 2\pi\nu$, so that $U = S(x)e^{i\omega t}$ or,
in real form, $U = S(x)\sin\omega t$ or $U = S(x)\cos\omega t$. The
function $S$ will then satisfy the ordinary differential equation

$$\frac{d^2 S}{dx^2} + k^2 S = 0 \tag{8-2}$$

where, in conformity with the usage of Chapter 7, the abbreviation

$$k = \frac{\omega}{v} = \frac{2\pi}{\lambda} \tag{8-3}$$


has been used. Here $\lambda$ stands again for the wave length of the disturbance
produced. The general solution of (2) is, clearly,

$$S = A \sin(kx + \delta) \tag{8-4}$$

where $A$ and $\delta$ are arbitrary constants. Every solution of the form (4) is
perfectly acceptable as far as the differential equation is concerned, but it
does not describe the behavior of the string. Solution (4) permits the ends
of the string to vibrate, whereas the physical condition requires them to be
fixed. It is therefore necessary to impose the following boundary conditions
upon the solutions (4):

(a) $S(0) = 0$

(b) $S(l) = 0$

Both can of course be satisfied by putting $A = 0$, but this would lead to
the unwanted solution $U = 0$ everywhere. Hence there is only the second
arbitrary constant, $\delta$, left for adjustment. It must be taken to be zero in
order to satisfy condition (a). (Choice of $\pi$, $2\pi$, etc., leads to the same
final result.) But the function $S = A \sin kx$ will not, for arbitrary $k$, obey (b). Thus the
problem can be solved only if we are willing to tamper with $k$: we are led to
eigenvalues. If $\sin kl$ is to be zero, $k$ must be $0$, or $\pi/l$, $2\pi/l$, $\ldots$, $n\pi/l$. The
value 0, however, is excluded for the same reason that $A = 0$ was rejected.
To each eigenvalue $k = n\pi/l$ ($n$ integral), there corresponds an
eigenfunction $S_n = A_n \sin(n\pi x/l)$. These eigenfunctions are of course
undetermined with respect to the constant multipliers $A_n$, which may be
chosen at will, and which may be different for every $n$.

Since $k$ is related to $\lambda$ by eq. (3), there is thus generated a corresponding
set of eigenvalues for $\lambda$, namely $\lambda = 2l/n$, $n$ integral. This is the well
known equation for the wave lengths of standing waves supported by a
vibrating string. In the simplest mode of vibration, corresponding to the
fundamental frequency, $\lambda = 2l$, the string has nodes only at the end points.
For the first harmonic, $\lambda = l$, there is in addition a node at the center of the
string, and so on. In general, the number of nodes is $n + 1$.
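The eigenvalue relations just obtained are easily tabulated. The following is a minimal sketch, not part of the original development; the function names `string_modes` and `node_positions` and the numerical values in the usage note are illustrative assumptions. It evaluates $k_n = n\pi/l$, $\lambda_n = 2l/n$ and $\nu_n = vk_n/2\pi$ for the first few modes, and lists the $n + 1$ nodes of the $n$-th mode.

```python
import math


def string_modes(length, tension, mass_per_length, count):
    """Return (n, k_n, wavelength_n, frequency_n) for the first `count` modes."""
    speed = math.sqrt(tension / mass_per_length)      # v = sqrt(T/m)
    modes = []
    for n in range(1, count + 1):
        k = n * math.pi / length                      # eigenvalue k_n = n*pi/l
        wavelength = 2.0 * length / n                 # lambda_n = 2l/n
        frequency = speed * k / (2.0 * math.pi)       # nu_n = v*k_n/(2*pi)
        modes.append((n, k, wavelength, frequency))
    return modes


def node_positions(length, n):
    """Points of [0, l] where sin(n*pi*x/l) vanishes: x = j*l/n, j = 0..n."""
    return [j * length / n for j in range(n + 1)]
```

For an assumed string of unit length with $T = 100$ and $m = 0.01$ (so that $v = 100$), `string_modes(1.0, 100.0, 0.01, 3)` gives frequencies in the ratio 1 : 2 : 3, and `node_positions(1.0, 3)` returns the four nodes of the third mode.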

The eigenfunctions under consideration have two important properties
which, as we shall see in sec. 8.5 et seq., are common to a large class of
eigenfunctions arising in connection with different problems. They are
(1) orthogonality, (2) completeness. To explain the meaning of these terms,
let us arrange the eigenvalues of eq. (2) in a definite order, $k_n = n\pi/l$,
$n = 1, 2, 3, \ldots$; and write again $S_n = A_n \sin(n\pi x/l)$. Orthogonality
means:

$$\int_0^l S_n(x) S_m(x)\,dx = C_n \delta_{nm} \tag{8-5}$$


The word comes originally from vector analysis (cf. Chapter 4) where two
vectors, $\mathbf{A}$ and $\mathbf{B}$, are said to be orthogonal if
$\mathbf{A} \cdot \mathbf{B} = A_xB_x + A_yB_y + A_zB_z = 0$. Similarly, vectors in $N$ dimensions
having components $A_i$, $B_i$ ($i = 1, 2, \ldots, N$) are said to be orthogonal when
$\sum_{i=1}^{N} A_iB_i = 0$. If now we
imagine a vector space of an infinite number of dimensions, in which the
components $A_i$ and $B_i$ become continuously distributed and everywhere
dense, $i$ is no longer a denumerable index but a continuous variable ($x$)
and the scalar product $\sum_i A_iB_i$ turns into $\int A(x)B(x)\,dx$. If it is zero, the
functions $A$ and $B$ are said to be orthogonal, and this is the sense in which
the word is used above.

The idea of orthogonality is indefinite unless reference is made to a
specific range of integration, which in the present case is from 0 to $l$. Here
the validity of eq. (5) is at once verified on substitution of the $S$-functions,
and the constant $C_n$ is seen to be

$$C_n = A_n^2 \int_0^l \sin^2\frac{n\pi x}{l}\,dx = A_n^2\,\frac{l}{n\pi}\int_0^{n\pi} \sin^2 u\,du = \frac{l}{2}A_n^2$$

For many purposes it is convenient to have $C_n$ equal to unity. This
can always be achieved by a suitable choice of $A_n$. In the present case,
every $C_n = 1$ if $A_n = \sqrt{2/l}$. If, therefore, we write $S_n = \sqrt{2/l}\,\sin(n\pi x/l)$,
the orthogonality relation (5) reads

$$\int_0^l S_n(x) S_m(x)\,dx = \delta_{nm} \tag{8-5'}$$

When the constants $A_n$ are thus chosen the eigenfunctions are said to be
normalized; functions satisfying (5') will henceforth be termed ortho-normal.
It is clear that a set of functions having the property of orthogonality
(expressed by eq. 5) can always be made ortho-normal by a proper
choice of multiplicative constants.
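Relation (5') lends itself to a direct numerical check. The sketch below is an illustration only: the helper names `simpson`, `S` and `overlap`, the default length $l = 1.5$ and the step count are assumptions made for the example. It integrates the product $S_nS_m$ over $(0, l)$ by Simpson's rule.

```python
import math


def simpson(func, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def S(n, x, length):
    """Normalized eigenfunction S_n = sqrt(2/l) * sin(n*pi*x/l)."""
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)


def overlap(n, m, length=1.5):
    """Integral of S_n * S_m over (0, l); should approximate delta_nm."""
    return simpson(lambda x: S(n, x, length) * S(m, x, length), 0.0, length)
```

Here `overlap(2, 2)` comes out close to unity and `overlap(2, 5)` close to zero, whatever positive value is taken for the length.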

A simple modification in the idea of orthogonality is to be made when
complex functions are considered. For these, condition (5) must be replaced
by

$$\int S_n^*(x)S_m(x)\,dx = C_n\delta_{nm} \tag{8-5*}$$

where $S^*$ represents the complex conjugate of $S$. This definition will be
used in later work.

We turn to the second property, that of completeness. A set of functions
is said to be complete if an arbitrary function, $f(x)$, satisfying the same
boundary conditions as the functions of the set, can be expanded as follows:

$$f(x) = \sum_{n=1}^{\infty} a_n S_n(x) \tag{8-6}$$

the $a_n$ being constant coefficients.

In the present instance, eq. (6) is equivalent to the theorem of Fourier
which states, in its simplest form, that a function $f(x)$ which vanishes both
at $x = 0$ and at $x = \pi$ (and has but a finite number of finite discontinuities)
may always be written²

$$f(x) = \sum_{n=1}^{\infty} a_n \sin nx \tag{8-7a}$$

the coefficients being given by

$$a_n = \frac{2}{\pi}\int_0^{\pi} f(\xi)\sin n\xi\,d\xi \tag{8-7b}$$

² See, for instance, Byerly, W. E., "Fourier Series and Spherical Harmonics,"
Ginn and Co., 1893, p. 38. The formulas 7a, b are special cases of eq. 42, developed
later in this chapter. Note also the more precise definition of completeness given in
sec. 8.8.

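Eqs. (7) may be modified by using, in place of $x$ and $\xi$, the variables $\pi x/l$
and $\pi\xi/l$. This has the effect of changing the range of $x$ from $(0,\pi)$ to
$(0,l)$, and the results are

$$f(x) = \sum_{n=1}^{\infty} a_n \sin\frac{n\pi}{l}x \tag{8-8a}$$

$$a_n = \frac{2}{l}\int_0^{l} f(\xi)\sin\frac{n\pi}{l}\xi\,d\xi \tag{8-8b}$$

If $S$ is taken in its normalized form, these equations read simply

$$f(x) = \sum_{n=1}^{\infty} a_n S_n(x), \qquad a_n = \int_0^l f(\xi)S_n(\xi)\,d\xi$$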

The fact of completeness has an important bearing on the problem of the
vibrating string which we originally set out to solve. While it is true that
only a particular $S_n$, for which $k$ assumes a specific eigenvalue, is a solution
of eq. (2) [the series (6) would not be a solution of (2)!], the value of $k$ is
not prescribed by eq. (1). Hence eq. (1) is satisfied by

$$U = \sum_n c_n S_n(x)\cos\omega_n t, \qquad \omega_n = vk_n$$

with arbitrary coefficients $c_n$. This, then, is the most general solution of
the string problem for a string starting from rest (terms in $\sin\omega_n t$ are to be
added when the initial velocity does not vanish). It reduces to a series like (6) for $t = 0$, a series which
can be chosen to represent any function $f(x)$ which vanishes at the end
points. Hence it is seen that any initial configuration of the string will
yield a solution of eq. (1), that is, a (standing) wave.
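The construction may be illustrated numerically. The following sketch is an assumption-laden example rather than part of the original argument: the unit length and unit wave speed, the triangular "plucked" initial shape, the truncation at 40 terms and all function names are choices made for illustration. It computes the coefficients $c_n = \int_0^l f(\xi)S_n(\xi)\,d\xi$ for the normalized $S_n$ and sums the series for $U(x,t)$.

```python
import math

LENGTH = 1.0   # assumed string length l
SPEED = 1.0    # assumed wave speed v


def simpson(func, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def S(n, x):
    """Normalized eigenfunction of the string."""
    return math.sqrt(2.0 / LENGTH) * math.sin(n * math.pi * x / LENGTH)


def pluck(x, height=1.0):
    """Triangular initial shape, peaked at mid-string, zero at both ends."""
    half = LENGTH / 2.0
    return height * (x / half if x <= half else (LENGTH - x) / half)


def coefficients(shape, count):
    """c_n = integral of shape(x) * S_n(x) over (0, l), n = 1..count."""
    return [simpson(lambda x, n=n: shape(x) * S(n, x), 0.0, LENGTH)
            for n in range(1, count + 1)]


def displacement(coeffs, x, t):
    """Partial sum of U = sum_n c_n S_n(x) cos(omega_n t), omega_n = v k_n."""
    total = 0.0
    for n, c in enumerate(coeffs, start=1):
        omega = SPEED * n * math.pi / LENGTH
        total += c * S(n, x) * math.cos(omega * t)
    return total
```

With `coeffs = coefficients(pluck, 40)`, the value `displacement(coeffs, 0.3, 0.0)` reproduces `pluck(0.3)` to within the truncation error, and the motion repeats after the period $2l/v$ of the fundamental.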

Fourier analysis is so useful a tool in applied mathematics that it seems
well here to digress for a moment and summarize its essential features
beyond the needs of the present problem. Details may be found in the
book by Byerly already mentioned. The general theory, including proofs
for the statements here made, will be found in secs. 8.5-8.8. A function
$f(x)$ defined between $x = 0$ and $x = \pi$ may also be expanded as a cosine
series:

$$f(x) = \tfrac{1}{2}b_0 + \sum_{n=1}^{\infty} b_n \cos nx \tag{8-9a}$$

where

$$b_n = \frac{2}{\pi}\int_0^{\pi} f(\xi)\cos n\xi\,d\xi \tag{8-9b}$$

except that the series may not yield the same values as $f(x)$ at discontinuities
and at the end points. Otherwise the developments (7) and (9) are
equivalent. There is, however, an interesting difference in the values of the
two series when they are extrapolated to the range $(-\pi,0)$. Here the series
(7) changes sign in such a way that $f(-x) = -f(x)$, while series (9) yields
$f(-x) = f(x)$, as is evident from the fact that $\sin x$ is an odd, $\cos x$ an even
function. Thus, if it is desired to expand a function between $-\pi$ and $+\pi$,
series (7) can be used only if the function is odd, series (9) only when it is even.
Now any function can be represented as the sum of an even and an odd one.
Hence, if an arbitrary function $f(x)$ is to be developed between $-\pi$ and $\pi$,
both cosine and sine series must be used. It is evident, therefore, that in
this more general case

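$$f(x) = \sum_{n=1}^{\infty} a_n \sin nx + \tfrac{1}{2}b_0 + \sum_{n=1}^{\infty} b_n \cos nx \tag{8-10a}$$

where

$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(\xi)\sin n\xi\,d\xi, \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(\xi)\cos n\xi\,d\xi \tag{8-10b}$$

The coefficients in front of the integrals are most easily checked as follows:
Multiply (10a) by $\sin mx$ and integrate over $x$ between $-\pi$ and $\pi$. Because
of the mutual orthogonality of the functions $\sin nx$, $\sin mx$, $\cos nx$, for $n \neq m$
the first of the relations (10b) is at once apparent; multiplication by $\cos mx$
yields the second in the same way.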

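If $f(x)$ is defined, not in the range $(-\pi,\pi)$, but in $(-l,l)$, a simple change
of variable from $x$, $\xi$ to $(\pi/l)x$, $(\pi/l)\xi$ in eq. (10) will produce the required
modification. The result is

$$f(x) = \sum_{n=1}^{\infty} a_n \sin\frac{n\pi}{l}x + \tfrac{1}{2}b_0 + \sum_{n=1}^{\infty} b_n \cos\frac{n\pi}{l}x$$

$$a_n = \frac{1}{l}\int_{-l}^{l} f(\xi)\sin\frac{n\pi}{l}\xi\,d\xi, \qquad b_n = \frac{1}{l}\int_{-l}^{l} f(\xi)\cos\frac{n\pi}{l}\xi\,d\xi \tag{8-11}$$

This may be expressed more simply in complex form. For if the sine and
cosine functions are written in their exponential form, the reader will verify
without difficulty³ that

$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{in\pi x/l}, \qquad c_n = \frac{1}{2l}\int_{-l}^{l} f(\xi)e^{-in\pi\xi/l}\,d\xi \tag{8-12}$$

The coefficients $c_n$ in this expansion are complex.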

When the series $f(x)$ as given by (12) is extrapolated beyond the range
$(-l,l)$ the function $f(x)$ is repeated periodically in every interval between
$(2n + 1)l$ and $(2n + 3)l$. Hence formula (12) permits representation of
periodic functions only. One may wonder, therefore, whether it is possible
to perform a Fourier analysis of a non-periodic function, defined in the
range of the entire real axis. Highly technical considerations for which the
reader is referred to more specific treatises⁴ affirm this possibility, provided
the function, $f(x)$, to be expanded is piecewise continuous and such that the
integral $\int_{-\infty}^{\infty} |f(x)|\,dx$ exists. In that case

$$f(x) = \int_{-\infty}^{\infty} c(k)e^{ikx}\,dk, \qquad c(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(\xi)e^{-ik\xi}\,d\xi \tag{8-13'}$$

These equations may be written more symmetrically by putting
$c(k) = (1/\sqrt{2\pi})\,g(k)$. They then become

$$f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} g(k)e^{ikx}\,dk, \qquad g(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)e^{-ikx}\,dx \tag{8-13}$$

³ Note that $e^{in\pi x/l}$ and $e^{im\pi x/l}$ are orthogonal functions in the sense of eq. 5*, the
range of integration being $(-l, l)$.


Two functions $f$ and $g$ related by eq. (13) are called a pair of Fourier transforms;
i.e., $g$ is the Fourier transform of $f$ and vice versa. Such pairs are of
great importance in the analysis of electrical impulses⁵ and in quantum
mechanics, where they effect the transformation from coordinate to momentum
space.
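A transform pair in the symmetric convention (13) can be checked by direct quadrature. The sketch below is illustrative only: the function name `fourier_transform`, the finite cutoff replacing the infinite limits and the step count are assumptions. It is tried on the Gaussian $e^{-x^2/2}$, a standard example of a function which reproduces itself under (13).

```python
import cmath
import math


def fourier_transform(f, k, cutoff=12.0, steps=4000):
    """g(k) = (2*pi)^(-1/2) * integral of f(x) exp(-i k x) dx over (-cutoff, cutoff).

    Simpson's rule; adequate only when f is negligible beyond the cutoff.
    """
    h = 2.0 * cutoff / steps
    total = 0j
    for i in range(steps + 1):
        x = -cutoff + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * f(x) * cmath.exp(-1j * k * x)
    return total * h / 3.0 / math.sqrt(2.0 * math.pi)


def gaussian(x):
    return math.exp(-x * x / 2.0)
```

For any real $k$, `fourier_transform(gaussian, k)` agrees with `gaussian(k)` to within rounding, the imaginary part being negligible because the integrand's odd part cancels.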


Problems.

a. Show that the Fourier transform of $f(x) = e^{-x^2/2}$ is $g(k) = e^{-k^2/2}$. (This fact is
occasionally expressed by saying: the error function is its own Fourier transform.)

b. Show that the F.T. of the step function

$$f(x) = \begin{cases} 1/2l & \text{if } |x| < l \\ 0 & \text{if } |x| > l \end{cases}$$

is $g(k) = \dfrac{1}{\sqrt{2\pi}}\,\dfrac{\sin kl}{kl}$. Note: as $l$ approaches zero, $f(x)$ becomes $\infty$ at $x = 0$. It is then
called a "unit impulse" function, or a $\delta$-function. Its transform is the constant $g(k) = 1/\sqrt{2\pi}$.

c. Show that the F.T. of a "finite wave train"

$$f(x) = \begin{cases} \cos k_0x & \text{if } |x| < l \\ 0 & \text{if } |x| > l \end{cases}$$

is

$$g(k) = \frac{1}{\sqrt{2\pi}}\left[\frac{\sin[(k_0 - k)l]}{k_0 - k} + \frac{\sin[(k_0 + k)l]}{k_0 + k}\right]$$

⁴ E.g., Titchmarsh, E. C., "Introduction to the Theory of Fourier Integrals," Oxford
University Press, 1937.

⁵ For further considerations see v. Kármán, T. and Biot, M., "Mathematical
Methods in Engineering," McGraw-Hill Book Co., 1940. An extensive list of Fourier
transforms has been compiled by Campbell, G. A., and Foster, R. M., "Fourier Integrals
for Practical Applications," Bell Tel. Syst. Tech. Pub. Monograph B-584, 1931.

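The Fourier Integral Theorem may be deduced immediately from (13).
On putting $g(k)$ into the integral for $f(x)$, there results

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dk \int_{-\infty}^{\infty} f(\xi)e^{ik(x-\xi)}\,d\xi \tag{8-14}$$

When $f(x)$ is real, the imaginary part of $e^{ik(x-\xi)}$ may clearly be neglected,
and the Fourier integral theorem takes the more customary form:

$$f(x) = \frac{1}{\pi}\int_0^{\infty} dk \int_{-\infty}^{\infty} f(\xi)\cos k(x - \xi)\,d\xi \tag{8-14'}$$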

Finally one may derive from (14) a result, sometimes called the Dirichlet
integral, which is of considerable utility in numerous problems. On performing
the integration over $k$ in (14), not between infinite limits but between
$-A$ and $A$, and then passing to the limit $A \to \infty$ we find

$$f(x) = \frac{1}{\pi}\lim_{A\to\infty}\int_{-\infty}^{\infty} f(\xi)\,\frac{\sin[A(x - \xi)]}{x - \xi}\,d\xi \tag{8-15}$$

Rigor, which is lacking in this simple derivation, may be supplied by
more subtle considerations.⁶ As a special form of (15) we note:

$$f(0) = \frac{1}{\pi}\lim_{A\to\infty}\int_{-\infty}^{\infty} f(x)\,\frac{\sin Ax}{x}\,dx$$

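All the foregoing results can be generalized to permit expansion of functions
of several variables, provided the integral

$$\int |f(x,y,z,\ldots)|\,dx\,dy\,dz\cdots$$

exists. For instance, in place of (12) we have

$$f(x,y) = \sum_{m,n} c_{m,n}\,e^{i(\pi/l)(mx+ny)}, \qquad c_{m,n} = \frac{1}{4l^2}\int_{-l}^{l}\int_{-l}^{l} f(\xi,\eta)\,e^{-i(\pi/l)(m\xi+n\eta)}\,d\xi\,d\eta \tag{8-16}$$

and in place of (13),

$$f(x,y) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(k_1,k_2)\,e^{i(k_1x+k_2y)}\,dk_1\,dk_2, \qquad g(k_1,k_2) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\,e^{-i(k_1x+k_2y)}\,dx\,dy \tag{8-17}$$

⁶ See Courant, R., "Vorlesungen über Differential- und Integralrechnung," Volume 1,
Second Edition, p. 373.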

8.3. Vibrating Circular Membrane; Fourier-Bessel Transforms. The
mathematical description of the vibrating membrane also leads to an interesting
eigenvalue problem. The wave equation, when written in polar
coordinates, was shown in sec. 7.10 (cf. eq. 7-35) to have the solution

$$U = S \cdot T, \qquad S_{k,m} = Z_m(k\rho)\,e^{im\varphi} \tag{8-18}$$

The fact, pointed out before, that $m$ must here be an integer to insure the
function to be physically meaningful ($e^{im\varphi}$ must be the same as $e^{im(\varphi+2\pi)}$
because $\varphi$ and $\varphi + 2\pi$ denote the same angle in the problem of the membrane)
may also be expressed by saying: the eigenvalues of $m$ in the differential
equation $d^2\Phi/d\varphi^2 + m^2\Phi = 0$ are all integers. Note that the corresponding
eigenfunctions, $e^{im\varphi}$, are orthogonal and form a complete set, the
range being $(0,2\pi)$. But we wish here to discuss another, less simple
eigenvalue problem.

Consider modes of vibration of the membrane which have circular
symmetry. This limits $m$ to the value zero, and (18) becomes

$$S_k = Z_0(k\rho) \tag{8-19}$$

We now impose the boundary condition: $U = 0$ at all times at the periphery
of the membrane, corresponding to the physical condition of having the
edge fixed. If the radius of the membrane is $a$, this means

$$Z_0(ka) = 0 \tag{8-20}$$

The function $Z_0$ is a linear combination of the Bessel functions $J_0$ and $N_0$,
a Bessel function of the second kind which is linearly independent of $J_0$
(sometimes called a Neumann function). But the latter may be shown to
be infinite at $\rho = 0$ and must therefore be excluded. The $Z_0$ in (19) and
(20) must therefore be interpreted as $J_0$. To satisfy (20) the parameter $k$
must be so adjusted as to make $ka$ a root of $J_0$, and since $J_0$ has an infinite
number of roots,⁷ the eigenvalues of $k$ will form an infinite set $k_i = x_i/a$,
where $x_i$ is the $i$-th root of $J_0(x)$. The corresponding eigenfunctions
are $J_0(k_i\rho)$.

Are these functions orthogonal? It is not difficult to show that
$\int_0^a J_0(k_1\rho)J_0(k_2\rho)\,d\rho$ is different from zero (an inspection of the graph of
the integrand will convince the reader). Thus it seems that eq. (5) fails
in this example. But we have overlooked an important feature: the element
of area of the circular membrane is not $d\rho$, but $2\pi\rho\,d\rho$. And now it
will be found that

$$\int_0^a J_0(k_m\rho)J_0(k_n\rho)\,\rho\,d\rho = C_n\delta_{nm} \tag{8-21}$$

⁷ The values of the roots of $J_0(x)$ are listed in books on Bessel functions. See also
Jahnke, E., and Emde, F., "Funktionentafeln," Teubner, 1933.


As the present problem shows, specification of a range of integration is
not sufficient in defining orthogonality of functions; it is also necessary to
state the weighting factor associated with each differential range of the
coordinate. In the problem of the vibrating string, the weighting factor $w(x)$
happened to be unity; here it is $w(\rho) = \rho$. In the next example it will be
seen to be $r^2$. The same $w$ which appears in the orthogonality relation will
also occur in the integrals defining expansion coefficients (cf. eq. 42).

To prove eq. (21) for $m \neq n$ we use the last of the formulas in sec. 3.9,
according to which the left-hand side has the value 0 because both $J_0(k_ma)$
and $J_0(k_na)$ vanish. According to another formula in this list,

$$\int_0^a [J_0(k_n\rho)]^2\,\rho\,d\rho = \frac{a^2}{2}\,[J_0'(k_na)]^2$$

But in view of eq. (3-69), $J_0' = -J_1$, so that the constant $C_n$ in (21) has
the value $(a^2/2)\,[J_1(k_na)]^2$.
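The role of the weighting factor can be seen numerically. The sketch below is an illustration under stated assumptions: the radius is taken as unity, the Bessel functions are evaluated from Bessel's integral $J_n(x) = \frac{1}{\pi}\int_0^\pi \cos(nt - x\sin t)\,dt$ (a standard representation, not one quoted in the text), the brackets used to locate the first three roots of $J_0$ are chosen by inspection, and all function names are hypothetical.

```python
import math


def bessel_j(order, x, panels=200):
    """J_n(x), integer n, from Bessel's integral by the trapezoidal rule."""
    h = math.pi / panels
    total = 0.5 * (1.0 + math.cos(order * math.pi))   # end points t = 0 and t = pi
    for i in range(1, panels):
        t = i * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def bisect(func, lo, hi, iterations=60):
    """Root of func in [lo, hi], assuming a sign change over the bracket."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simpson(func, a, b, steps=400):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


RADIUS = 1.0                                           # assumed membrane radius a
roots = [bisect(lambda x: bessel_j(0, x), lo, lo + 1.0) for lo in (2.0, 5.0, 8.0)]
k = [root / RADIUS for root in roots]                  # eigenvalues k_i = x_i / a


def weighted_overlap(km, kn):
    """Integral of J0(km r) J0(kn r) r dr over (0, a), as in eq. (21)."""
    return simpson(lambda r: bessel_j(0, km * r) * bessel_j(0, kn * r) * r, 0.0, RADIUS)


def plain_overlap(km, kn):
    """The same integral without the weighting factor r."""
    return simpson(lambda r: bessel_j(0, km * r) * bessel_j(0, kn * r), 0.0, RADIUS)
```

In this sketch `weighted_overlap(k[0], k[1])` is zero to within the quadrature error, `weighted_overlap(k[0], k[0])` agrees with $(a^2/2)[J_1(k_1a)]^2$, and `plain_overlap(k[0], k[1])` is distinctly different from zero.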

The question of the completeness of the functions $J_0(k_n\rho)$, i.e., the possibility
of the expansion

$$f(\rho) = \sum_{n=1}^{\infty} a_n J_0(k_n\rho) \tag{8-22}$$

will be investigated later in this chapter. We shall here anticipate completeness
provided, of course, that $f(\rho)$ vanishes also at $\rho = a$. Granting this, the
coefficients $a_n$ may be computed in the manner already illustrated in connection
with Fourier series:

Multiply both sides of (22) by $J_0(k_m\rho)\,\rho\,d\rho$ and integrate. The result
is, again in view of (21),

$$\int_0^a f(\rho)J_0(k_m\rho)\,\rho\,d\rho = a_m\,\frac{a^2}{2}\,[J_1(k_ma)]^2$$

If we use the normalized function $S_n = (\sqrt{2}/a)\,[J_1(k_na)]^{-1}J_0(k_n\rho)$, the
expansion reads

$$f(\rho) = \sum_{n=1}^{\infty} a_n S_n(\rho)$$

and the coefficients are

$$a_n = \int_0^a f(\rho)S_n(\rho)\,\rho\,d\rho$$

The problem of the circular membrane has been simplified by our
assumption of circular symmetry. One may wonder what happens if
types of vibrations are permitted in which the displacement is a function of
both $\rho$ and $\varphi$, for these certainly occur. It is then necessary to use the functions
$S_{k,m}$ defined in eq. (18). These may easily be seen to be orthogonal
with respect to both indices, i.e.,

$$\int_0^a \rho\,d\rho \int_0^{2\pi} d\varphi\; S^*_{k,m}\,S_{k',m'} = C_{k,m}\,\delta_{kk'}\delta_{mm'}$$

Moreover it is possible to expand

$$f(\rho,\varphi) = \sum_{n,m} a_{nm}\,J_m(k_n\rho)\,e^{im\varphi}$$

The details of this development may be left as an exercise to the interested
reader; they are worked out fully in some works on sound.⁸

The condition $f(a) = 0$, upon which the expansion (22) was based,
may be removed; the range of integration must then be extended from 0 to
$\infty$. Now it is clear that, as $a \to \infty$, the values $k_n$ move closer and closer
together. In the limit they will, in fact, form a continuum. When the
passage to this limit is performed, eq. (22) becomes what is known as a
Fourier-Bessel integral,⁹ an equation which is useful in the theory of radiation.
While the transition to the limit is difficult, the result may be
obtained quite simply by a method used by Stratton,¹⁰ which will here be
given.

Suppose $f(x,y)$ can be expanded according to eq. (17). In these
equations, we transform the variables of integration to polar form:
$x = \rho\cos\varphi$, $y = \rho\sin\varphi$; $k_1 = k\cos\alpha$, $k_2 = k\sin\alpha$.
They then read

$$f(\rho,\varphi) = \frac{1}{2\pi}\int_0^{\infty} k\,dk \int_0^{2\pi} g(k,\alpha)\,e^{ik\rho\cos(\varphi-\alpha)}\,d\alpha \tag{8-23a}$$

$$g(k,\alpha) = \frac{1}{2\pi}\int_0^{\infty} \rho\,d\rho \int_0^{2\pi} f(\rho,\varphi)\,e^{-ik\rho\cos(\varphi-\alpha)}\,d\varphi \tag{8-23b}$$

⁸ See particularly Morse, P. M., "Vibration and Sound," McGraw-Hill Book Co.,
1936, p. 163 et seq.

⁹ We are here following a terminology which seems to be gaining ground, although we
have been unable to discover its origin. It appears that relations of the form (24) were
first discovered by Hankel.

¹⁰ Stratton, J. A., "Electromagnetic Theory," McGraw-Hill Book Co., 1941.

We now take for $f(\rho,\varphi)$ the special function $f(\rho)e^{im\varphi}$. The integration over
$\varphi$ appearing in (23b) may then be performed with the use of eq. (3-72a),
according to which

$$\int_0^{2\pi} e^{i[m\varphi - k\rho\cos(\varphi-\alpha)]}\,d\varphi = \int_0^{2\pi} e^{i[m\varphi - k\rho\sin(\varphi-\alpha+\pi/2)]}\,d\varphi = e^{im(\alpha-\pi/2)}\int_0^{2\pi} e^{i[m\varphi - k\rho\sin\varphi]}\,d\varphi = 2\pi\,e^{im(\alpha-\pi/2)}\,J_m(k\rho)$$

Thus

$$g(k,\alpha) = \int_0^{\infty} f(\rho)J_m(k\rho)\,\rho\,d\rho \cdot e^{im(\alpha-\pi/2)} \equiv g(k)\,e^{im(\alpha-\pi/2)}$$

On putting this answer into (23a) we find

$$f(\rho)e^{im\varphi} = \frac{1}{2\pi}\int_0^{\infty} g(k)\,k\,dk \int_0^{2\pi} e^{i[m(\alpha-\pi/2) + k\rho\cos(\varphi-\alpha)]}\,d\alpha = \frac{1}{2\pi}\int_0^{\infty} g(k)\,k\,dk \cdot 2\pi\,e^{im\varphi}J_m(k\rho)$$

These results may be expressed in the symmetrical form

$$f(\rho) = \int_0^{\infty} g(k)J_m(k\rho)\,k\,dk \tag{8-24a}$$

$$g(k) = \int_0^{\infty} f(\rho)J_m(k\rho)\,\rho\,d\rho \tag{8-24b}$$

The functions $f$ and $g$ satisfying relations (24a, b) are said to be a pair of
Fourier-Bessel transforms. It is to be noted that the expansion (24a) of
the function $f(\rho)$ holds for every value of the integer $m$. Eq. (22), therefore,
is a special case of a Fourier-Bessel expansion.


Problem a. Show, using the formulas of sec. 3.9, that the Fourier-Bessel transform of 
/(p) = p^, with respect to Jn, is 



Problem b. Verify the identity

$$f(\rho') = \int_0^{\infty}\int_0^{\infty} f(\rho)\,J_m(k\rho)\,J_m(k\rho')\,\rho k\,d\rho\,dk$$

8.4. Vibrating Sphere with Fixed Surface. The problem of a sphere
vibrating with a node at its surface is of little interest in acoustics, for if
there is never any displacement at the surface, the sphere cannot radiate.
However, the same problem, interpreted quantum-mechanically, describes
the motion of a particle within a spherical cavity and has as such enjoyed
some attention in nuclear physics. For the sake of simplicity, we shall
here maintain the acoustic interpretation.

The solution, $S$, of the space part of the wave equation was shown in
eq. (7-43) et seq. to be of the form

$$S_k = r^{-1/2}\,Z_{l+1/2}(kr)\,Y_l(\theta,\varphi) \tag{8-25}$$

As usual, $k$ determines the frequency of the vibration: $\nu = kv/2\pi$, $v$ being
the velocity of the waves inside the spherical medium. Eigenvalues in $k$,
and hence in the frequency "spectrum," are induced by the boundary
condition

$S_k = 0$ at $r = a$, the radius of the sphere.

According to (25), this is satisfied only if $Z_{l+1/2}(ka) = 0$. Thus, for every
integer $l$, there exists an infinite sequence of $k_l$ such that $k_la$ is a root of
$Z_{l+1/2}$. But $Z_{l+1/2}$ is a linear combination of $J_{l+1/2}$ and a second, linearly
independent solution, of which only the former can be retained because the
second solution is always infinite at
$r = 0$ and does not, therefore, represent a possible mode of vibration.
Hence it is

$$J_{l+1/2}(ka) = 0$$

which determines the eigenvalues of $k$.

When $l = 0$ the situation is very simple indeed, for $J_{1/2}(x) = \sqrt{2/\pi x}\,\sin x$.
Thus the $k$'s are determined by $\sin(ka) = 0$, which means that for this case

$$k_{0,n} = \frac{n\pi}{a}, \qquad n \text{ an integer}$$

The frequency spectrum is much the same as for the vibrating string. Let
us now see what is the physical meaning of the condition $l = 0$. A glance
at eq. (25) shows that $Y_0(\theta,\varphi)$ is a constant, and this means there are no
radial nodes. The sphere vibrates in spherical symmetry.

In addition to these eigenvalues, which have a linear distribution, there
are the other sets given by $J_{l+1/2}(k_{l,n}a) = 0$. These are irregularly distributed
and interspersed between the $k_{0,n}$.
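The interspersion of the eigenvalues may be made concrete for $l = 1$. The sketch below rests on an assumption not stated in the text, namely the standard closed form $J_{3/2}(x) = \sqrt{2/\pi x}\,(\sin x/x - \cos x)$; the function names are illustrative. It locates the roots $k_{1,n}a$ by bisection between consecutive multiples of $\pi$, which are the roots $k_{0,n}a$ for $l = 0$.

```python
import math


def radial_l1(x):
    """J_{3/2}(x) apart from the positive factor sqrt(2/(pi*x))."""
    return math.sin(x) / x - math.cos(x)


def bisect(func, lo, hi, iterations=80):
    """Root of func in [lo, hi], assuming a sign change over the bracket."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ka_values(count):
    """Return the first `count` values of k*a for l = 0 and for l = 1."""
    l0 = [n * math.pi for n in range(1, count + 1)]
    l1 = [bisect(radial_l1, n * math.pi, (n + 1) * math.pi)
          for n in range(1, count + 1)]
    return l0, l1
```

Calling `ka_values(3)` shows each $l = 1$ root lying between two successive $l = 0$ roots; the corresponding frequencies follow from $\nu = kv/2\pi$.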

The orthogonality of the $S_k$ (eq. 25) is at once evident. Orthogonality
with respect to the index $l$ arises from a property of the spherical harmonics
proved in sec. 3.53. But even for the same $l$ and different $k$ the functions
retain their orthogonality. The weighting factor in this case is $r^2$ because
the volume element contains this factor. Thus

$$\int_0^a S_{k_1}S_{k_2}\,r^2\,dr \propto \int_0^a J_{l+1/2}(k_1r)\,J_{l+1/2}(k_2r)\,r\,dr$$

an expression which vanishes unless $k_1 = k_2$, as is seen from the last formula
of sec. 3.9.

By more special considerations it may also be shown that the set of
functions $S_k$ is complete in the sense that any $f(r,\theta,\varphi)$ which vanishes
at $r = a$ and is piecewise continuous can be expanded in the form

$$f(r,\theta,\varphi) = \sum_l \sum_n a_{l,n}\,r^{-1/2}\,J_{l+1/2}(k_{l,n}r)\,Y_l(\theta,\varphi)$$

For the special case $l = 0$ this expansion reduces to a Fourier series.

Problem. Compute the lowest 12 eigenfrequencies of the vibrating sphere.


8.5. Sturm-Liouville Theory. Deeper insight into the nature of
eigenvalue problems which arise in connection with second order differential
equations is obtained from a study of a theory at once simple and
beautiful, the theory of the Sturm-Liouville equation. Nearly every
eigenvalue problem encountered in physics and chemistry leads to an
equation of the general form

$$L(u) + \lambda wu = 0 \tag{8-26}$$

where the differential operator $L$ is defined by

$$L(u) = (pu')' - qu \tag{8-27}$$

The quantities $p$, $q$, and $w$ are understood to be functions of the independent
variable $x$, and we shall suppose that $w$, which will soon be recognized as
the former weighting function, satisfies

$$w(x) \geq 0$$

in the entire range of the variable $x$. This range is different in different
problems, but it will be assumed to be finite and to extend from $a$ to $b$.
Finally, $\lambda$ is a constant; it will turn out to be the eigenvalue parameter.

An operator¹¹ of the form (27) is said to be self-adjoint. The necessary
and sufficient condition for the general second order differential operator

$$D(u) = fu'' + gu' + hu$$

(in which $f$, $g$, and $h$ are functions of $x$) to be self-adjoint is simply that
$g = f'$. Eq. (26), however, is not a very special one. Every second order
differential operator $D(u)$ can be made self-adjoint; it need only be multiplied
from the left by

$$\exp\left[\int \frac{g - f'}{f}\,dx\right]$$

Thus all differential equations
encountered in Chapter 2 may be written in self-adjoint form, and the
theory we are presenting applies to them all. In Table 1 we list the factor
$F$ by which the equation named on the left, written in the customary form
in which it appears in Chapter 2, must be multiplied in order to be self-adjoint,
and also the quantities $p$, $q$, and $w$ in (26).

¹¹ For a general definition of an operator and its adjoint the reader is referred to
Courant-Hilbert, "Methoden der Mathematischen Physik," Second Edition, Vol. II,
p. 434, or Frank, P. and v. Mises, R., "Differentialgleichungen der Physik," Vol. I,
p. 780.

The function $u$ is subject to boundary conditions. In the examples of
the preceding sections these were of different types: in the problem of the
string every $u$ had to vanish at both end points, in the other problem it
was to be finite at $r = 0$ but zero at $r = a$. Examination of these and
many other examples (see Chapter 11) will show that the boundary condition
in most problems of interest may be expressed in the uniform way

$$puu'\big|_a = puu'\big|_b = 0$$

for usually either $p$ or $u$ or $u'$ vanishes at the end points of the range. But
it is equally satisfactory to state these conditions in a somewhat milder
form: Let $u$ and $v$ be any permissible solutions of eq. (26); we then require

$$vpu'\big|_a = vpu'\big|_b \tag{8-28}$$

On the basis of this condition it is possible to establish the important
theorem:

$$\int_a^b vL(u)\,dx = \int_a^b uL(v)\,dx \tag{8-29}$$

The proof is straightforward:

$$\int_a^b vL(u)\,dx = \int_a^b v(pu')'\,dx - \int_a^b vqu\,dx = vpu'\Big|_a^b - \int_a^b v'pu'\,dx - \int_a^b vqu\,dx$$

The first term on the right vanishes because of (28); the second may be
transformed by another partial integration into $-v'pu\big|_a^b + \int_a^b u(pv')'\,dx$,
of which the first vanishes also. But the remaining integral,
$\int_a^b [u(pv')' - uqv]\,dx$, is nothing other than $\int_a^b uL(v)\,dx$.

The result (29)
is often expressed by saying that the operator $L$ is Hermitian with respect
to functions satisfying condition (28). The importance of Hermitian
operators will be more evident in the next chapter.

8.6. Variational Aspects of the Eigenvalue Problem.¹² Before proceeding
further, the reader is advised to review the main points of Chapter 6.
It will be shown that the Sturm-Liouville equation (26) is the Euler condition
which the function $u$ must satisfy in order that (1) the integral¹³

$$\int (pu'^2 + qu^2)\,dx = A(u) \tag{8-30}$$

take on a stationary value, (2) the function $u$ be normalized:

$$\int wu^2\,dx = 1 \tag{8-31}$$
The proof is simple. In the notation of sec. 6.5 we have, on writing
$\lambda_1 = -\lambda$ for convenience,

$$K = I - \lambda I_1 = pu'^2 + qu^2 - \lambda wu^2$$

and the Euler equation (6-15) is

$$\frac{\partial K}{\partial u} - \frac{d}{dx}\frac{\partial K}{\partial u'} = 0 \tag{8-32}$$

Written out, this reads $2qu - 2\lambda wu - 2(pu')' = 0$, which is clearly identical
with (26). The eigenvalue $\lambda$ here plays the role of
a Lagrangian multiplier. We have thus seen that the process of solving
the Sturm-Liouville equation is tantamount to a search for those functions
$u(x)$ which maximize or minimize $A$, subject to condition (31). This condition
is important, for the integral $A$ has usually only a single stationary
value; but when eq. (31) is imposed $A$ has numerous values each of which
is stationary for a given neighborhood of functions $u(x)$, although of course
only one of them is an absolute minimum or maximum.

Example. Let us see whether the procedure here outlined will actually
lead to a simple type of function defined by a Sturm-Liouville equation, say
the Legendre polynomial. We start by assuming

$$u = a + bx + cx^2$$

with $a$, $b$, and $c$ unknown. From Table 1 we see that $p = 1 - x^2$, $q = 0$, $w = 1$; hence

$$A = \int_{-1}^{1} (1 - x^2)(b^2 + 4bcx + 4c^2x^2)\,dx = \tfrac{4}{3}b^2 + \tfrac{16}{15}c^2$$

We require that

$$\int_{-1}^{1} u^2\,dx = 2a^2 + \tfrac{2}{3}(b^2 + 2ac) + \tfrac{2}{5}c^2 = 1$$

Thus it is necessary to minimize

$$\tfrac{4}{3}b^2 + \tfrac{16}{15}c^2 - \lambda\left[2a^2 + \tfrac{2}{3}(b^2 + 2ac) + \tfrac{2}{5}c^2\right]$$

by choice of $a$, $b$, and $c$. On putting the partial derivatives with respect to
$a$, $b$, and $c$ equal to zero and finally rewriting the normalization condition,
four equations are obtained for the determination of the quantities $a$, $b$, $c$,
and $\lambda$:

$$\lambda(3a + c) = 0 \tag{1}$$

$$b(2 - \lambda) = 0 \tag{2}$$

$$c(8 - 3\lambda) - 5a\lambda = 0 \tag{3}$$

$$a^2 + \tfrac{1}{3}(b^2 + 2ac) + \tfrac{1}{5}c^2 = \tfrac{1}{2} \tag{4}$$

¹² The development in this and the following sections leans heavily on Courant-Hilbert,
"Methoden der Mathematischen Physik," Vol. I, Second Edition.

¹³ Henceforth in this chapter limits of integration will not be indicated when the
range is from $a$ to $b$.

Suppose we put $c = 0$. Then, according to (1) and (3), $a\lambda = 0$, while (4)
yields a relation between $a$ and $b$. Hence we can put either $a = 0$ or
$\lambda = 0$. In the latter instance, i.e., if

$$\lambda = 0$$

we get from (2): $b = 0$, and from (4): $a = \sqrt{1/2}$. In the former instance,
namely $a = 0$, (2) yields

$$\lambda = 2$$

and (4) gives $b = \sqrt{3/2}$.

Now instead of assuming $c = 0$, let us take $b = 0$. Consistency then
requires that neither $a$ nor $c$ nor $\lambda$ can be zero. Hence we find from (1):
$c = -3a$, and from (3) and (4):

$$\lambda = 6$$

and $a = \sqrt{5/8}$. We have thus determined altogether three solutions, corresponding
to three possible values of $\lambda$:

    λ     u
    0     $\sqrt{1/2}$
    2     $\sqrt{3/2}\,x$
    6     $\sqrt{5/8}\,(1 - 3x^2)$

The reader will notice that the $\lambda$'s are the first three values of $l(l + 1)$, and
the $u$'s the first three normalized Legendre polynomials. We now return
to eq. (26) in its general form.
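Before leaving the example, its arithmetic can be confirmed in exact rational numbers. The following sketch is an illustration only; the function names and the use of unnormalized coefficient triples $(a, b, c)$ are assumptions made for convenience. It evaluates the ratio of $A$ to the normalization integral for each solution, tests the stationarity conditions (1) to (3), and returns the square of the normalizing factor.

```python
from fractions import Fraction as F


def A(a, b, c):
    """A = (4/3) b^2 + (16/15) c^2 for u = a + b x + c x^2 with p = 1 - x^2."""
    return F(4, 3) * b * b + F(16, 15) * c * c


def norm(a, b, c):
    """Integral of u^2 over (-1, 1)."""
    return 2 * a * a + F(2, 3) * (b * b + 2 * a * c) + F(2, 5) * c * c


def stationarity(a, b, c, lam):
    """Left-hand sides of conditions (1), (2), (3); all must vanish."""
    return (lam * (3 * a + c), b * (2 - lam), c * (8 - 3 * lam) - 5 * a * lam)


# Unnormalized solutions (a, b, c), keyed by the eigenvalue they belong to.
candidates = {0: (1, 0, 0), 2: (0, 1, 0), 6: (1, 0, -3)}


def check():
    """Map each eigenvalue to (A/norm, stationarity residuals, 1/norm)."""
    results = {}
    for lam, (a, b, c) in candidates.items():
        results[lam] = (A(a, b, c) / norm(a, b, c),
                        stationarity(a, b, c, lam),
                        1 / norm(a, b, c))
    return results
```

In each case the ratio returned by `check()` equals the eigenvalue itself, the residuals vanish, and the last entry gives the squares $1/2$, $3/2$ and $5/8$ of the normalizing factors found above.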

Let us assume for definiteness that the extremals of $A$ are minima; the
argument to be presented is equally valid when they are maxima. Also,
let $u_1(x)$ be that function which produces the lowest minimum of $A$ while
satisfying (31), and let $\lambda_1$ be the eigenvalue corresponding to it. We now
seek a function $u_2(x)$ which will also produce a minimum of $A$ and satisfy
(31), but which, in addition, shall be orthogonal to $u_1$:

$$\int wu_1u_2\,dx = 0 \tag{8-33}$$

The Euler equation for $u_2$ is more complicated than that for $u_1$ since $u_2$
must satisfy two accessory conditions and $u_1$ only one. In fact,

$$K = pu_2'^2 + qu_2^2 - \lambda_2wu_2^2 - \mu wu_1u_2$$

$\mu$ being a new Lagrangian multiplier. Hence eq. (32) becomes

$$2qu_2 - 2\lambda_2wu_2 - \mu wu_1 - 2(pu_2')' = 0$$

and this is identical with

$$L(u_2) + \lambda_2wu_2 + \frac{\mu}{2}wu_1 = 0$$

To determine the value of $\mu$ we multiply this equation by $u_1$ and integrate,
making use of relation (29) which, of course, we require $u_1$ and $u_2$
to obey. The result is

$$\int u_2L(u_1)\,dx + \lambda_2\int wu_1u_2\,dx + \frac{\mu}{2}\int wu_1^2\,dx = 0 \tag{8-34}$$

Here the first term is $-\lambda_1\int wu_1u_2\,dx$ because $u_1$ satisfies (26), and this
equals zero because of (33). For the latter reason, the second term of
(34) also vanishes. But the integral appearing in the third term is certainly
finite. Hence we conclude that the multiplier $\mu = 0$; we might as
well not have required relation (33): $u_2$ satisfies the same equation as $u_1$,
but for a different eigenvalue $\lambda_2$. Moreover, it is automatically orthogonal
to $u_1$.

This process may be continued. Suppose we seek a function $u_3$ which
will minimize $A$, subject to the three conditions

$$\int wu_3^2\,dx = 1, \qquad \int wu_1u_3\,dx = \int wu_2u_3\,dx = 0$$

The minimum thus obtained will lie at least as high as that produced by $u_2$,
for the choice of functions has been further restricted. The quantity $K$
appearing in Euler's equation now contains three undetermined parameters,
$\lambda_3$, $\mu$, and $\nu$. The last two of these may be shown to vanish by a method
similar to that above. By further extension of this process we are led to
this result: If we desire a set of functions which (1) minimize $A$, (2) are
normalized, (3) are mutually orthogonal, they are found as solutions of
eq. (26).

Conversely, it is easy to show that all solutions of (26) belonging to
different eigenvalues are orthogonal. To do this, one need only multiply
two specific forms of (26):

$$L(u_i) + \lambda_i w u_i = 0, \qquad L(u_j) + \lambda_j w u_j = 0$$

by $u_j$ and $u_i$ respectively, integrate each equation and subtract. When
(29) is used, the result is simply

$$(\lambda_i - \lambda_j) \int w u_i u_j\,dx = 0 \tag{8-35}$$

Hence either $\lambda_i = \lambda_j$, or $u_i$ and $u_j$ are orthogonal.

The case in which $\lambda_i = \lambda_j$, where two (or more) eigenfunctions belong
to the same eigenvalue, is not of very great interest under the simple
conditions we are here considering (real eigenfunctions, one independent
variable). It is very much more important in the more general eigenvalue
problems of Chapter 11. As to terminology: whenever several
eigenfunctions, i.e., linearly independent eigenfunctions, are associated with one
eigenvalue, that eigenvalue is said to be degenerate.

It may seem strange to find eq. (35) predicting orthogonality only for
non-degenerate cases, while the variational argument of the preceding
paragraphs implies no restriction of this sort. Harmony is restored when we
realize that a set of linearly independent solutions of eq. (26) can always be
combined in such a way as to form an equally numerous, equivalent set of
orthogonal solutions. (Cf., for instance, the method of Schmidt,* sec.
10.8.) Hence we may, if we like, speak of the orthogonality of all
solutions of eq. (26), assuming tacitly that the process of orthogonalization
has been carried out on all sets of functions belonging to a degenerate
eigenvalue.

* The use of this method for functions instead of vectors is illustrated in Lindsay and
Margenau, "Foundations of Physics," John Wiley and Sons, p. 425.

Example. Degeneracy arises when in the vibrating string problem
expressed by eq. (2) one replaces the ordinary boundary conditions (a)
and (b), sec. 8.2, by one requiring only periodicity:

$$S(a) = S(b) \tag{8-36}$$

The eigenvalue parameter, $\lambda$, in this equation is $k^2$. Moreover, it is to be
noted that the periodicity condition (36) conforms to our general
requirement (28). The solution satisfying (36) is easily seen to be

$$S = A \sin\left(\delta + \frac{2\pi n}{l} x\right)$$

where $l = b - a$, and $\delta$ is arbitrary, the quantity $k$ taking on the values
$2\pi n/l$. But $n$ may be a positive or a negative integer. Hence to the
same value of $k^2$, namely $4\pi^2 n^2/l^2$, there correspond the two functions

$$S_1 = A \sin\left(\delta + \frac{2\pi n}{l} x\right), \qquad S_2 = A \sin\left(\delta - \frac{2\pi n}{l} x\right)$$

Except when $\delta$ is an integral multiple of $\pi$, as it must be when the ordinary
boundary condition is imposed, $S_1$ and $S_2$ are linearly independent. Yet
they are not orthogonal (except in the special case when $\delta = \pi/4$). It is
easily seen, however, that if we put

$$\psi_1 = \sqrt{\frac{2}{l}} \sin\left(\delta + \frac{2\pi n}{l} x\right)$$

$$\psi_2 = \sqrt{\frac{2}{l(1 - \epsilon^2)}} \left[\epsilon \sin\left(\delta + \frac{2\pi n}{l} x\right) - \sin\left(\delta - \frac{2\pi n}{l} x\right)\right]$$

wherein $\epsilon = \sin^2\delta - \cos^2\delta$, we have a pair of functions, satisfying the
differential equation for the same $k^2$, which are both orthogonal and normal.
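
The orthonormalization in this example can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $w = 1$, $a = 0$, $b = l$, the sample values $n = 3$, $l = 2$, $\delta = 0.3$, and a hand-written Simpson rule. The names `simpson`, `string_pair` and `overlap_report` are ours and do not occur in the text.

```python
import math

def simpson(func, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def string_pair(n, l, delta):
    """Return S1, S2 (amplitude A = 1), the combined pair, and epsilon."""
    k = 2 * math.pi * n / l
    eps = math.sin(delta) ** 2 - math.cos(delta) ** 2
    s1 = lambda x: math.sin(delta + k * x)
    s2 = lambda x: math.sin(delta - k * x)
    c1 = math.sqrt(2 / l)
    c2 = math.sqrt(2 / (l * (1 - eps ** 2)))
    psi1 = lambda x: c1 * s1(x)
    psi2 = lambda x: c2 * (eps * s1(x) - s2(x))
    return s1, s2, psi1, psi2, eps

def overlap_report(n=3, l=2.0, delta=0.3):
    """Overlap integrals over one full range 0..l (assumed a = 0, b = l)."""
    s1, s2, psi1, psi2, eps = string_pair(n, l, delta)
    inner = lambda f, g: simpson(lambda x: f(x) * g(x), 0.0, l)
    return {
        "raw_overlap": inner(s1, s2),      # S1 and S2 are not orthogonal
        "expected_raw": 0.5 * l * eps,     # closed form (l/2) * epsilon
        "norm1": inner(psi1, psi1),
        "norm2": inner(psi2, psi2),
        "cross": inner(psi1, psi2),
    }

if __name__ == "__main__":
    for name, value in overlap_report().items():
        print(f"{name:>13s} = {value: .6e}")
```

The raw overlap of $S_1$ and $S_2$ comes out as $(l/2)(\sin^2\delta - \cos^2\delta)$, which vanishes only for $\delta = \pi/4$ (and odd multiples of it), while the combined pair is orthonormal to the accuracy of the quadrature.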

One further point is to be made in connection with the variational
property of the solutions of the Sturm-Liouville equation. We have seen
that the $u_i$ minimize the integral $\Lambda$. What are the stationary values of $\Lambda$
thus produced? Let us compute them.

$$\begin{aligned}
\Lambda(u_i) &= \int (p u_i'^2 + q u_i^2)\,dx = \Big[p u_i u_i'\Big]_a^b - \int [u_i (p u_i')' - u_i q u_i]\,dx \\
&= -\int u_i L(u_i)\,dx = \lambda_i \int w u_i^2\,dx = \lambda_i
\end{aligned} \tag{8-37}$$

The simple and interesting answer is, then, that the stationary values of $\Lambda$
are the eigenvalues $\lambda_i$.

Problem. The integral $\Lambda(u)$ for the differential equation $u'' + k^2 u = 0$ is
$\int_0^l (u')^2\,dx$. Assume for $u$ any normalized polynomial containing the factors
$x$ and $x - l$, and show that $\Lambda$ computed for this $u$ is greater than the
lowest eigenvalue $\pi^2/l^2$.
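
A quick exact check of the Problem for two trial polynomials may be helpful. The sketch below is an illustration, not a general proof: instead of normalizing $u$ first it evaluates the quotient $\int_0^l u'^2\,dx \big/ \int_0^l u^2\,dx$, which equals $\Lambda$ for the normalized function. The helper names and the two trial polynomials are our own choices.

```python
from fractions import Fraction
import math

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest power first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def poly_diff(p):
    """Derivative of a polynomial (coefficient list, lowest power first)."""
    return [Fraction(k) * Fraction(c) for k, c in enumerate(p)][1:]

def poly_integral(p, l):
    """Exact integral of the polynomial from 0 to l."""
    l = Fraction(l)
    return sum(Fraction(c) * l ** (k + 1) / (k + 1) for k, c in enumerate(p))

def rayleigh_quotient(p, l=1):
    """Lambda(u) for the normalized trial function u proportional to p."""
    du = poly_diff(p)
    return poly_integral(poly_mul(du, du), l) / poly_integral(poly_mul(p, p), l)

# Two assumed trial functions on 0..1, both containing the factors x and (x - 1):
PARABOLA = [0, 1, -1]            # x (1 - x)
QUARTIC = [0, 1, 0, -2, 1]       # x (1 - x) (1 + x (1 - x)) = x - 2x^3 + x^4

if __name__ == "__main__":
    print("lowest eigenvalue pi^2 =", math.pi ** 2)
    for name, p in (("parabola", PARABOLA), ("quartic", QUARTIC)):
        print(name, float(rayleigh_quotient(p)))
```

For the parabola the quotient is exactly $10/l^2$, to be compared with $\pi^2/l^2 \approx 9.8696/l^2$; the quartic lies closer to the eigenvalue but still above it, as the variational principle requires.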

8.7. Distribution of High Eigenvalues. — Preceding considerations indi- 
cate no uniform law according to which the eigenvalues of any differential 
equation are arranged; regularity does, however, prevail for the high 
eigenvalues, as will now be shown. Let all $\lambda$'s be arranged in numerical
order, so that $\lambda_0$ is the lowest. Although no proof of the existence of an
infinite number of eigenvalues has here been given, their variational
meaning strongly suggests it,* and the examples confirm this expectation. We
shall now prove the theorem:

$$\lim_{n\to\infty} \frac{\lambda_n}{n^2} = \text{const.} \tag{8-38}$$

Under the substitutions

$$z = (pw)^{1/4} u, \qquad t = \int_a^x \left(\frac{w}{p}\right)^{1/2} dx$$

eq. (26) takes the form

$$\frac{d^2 z}{dt^2} - f(t) z + \lambda z = 0 \tag{8-39}$$

Detailed consideration which may be left to the reader shows that the
function $f(t)$ is bounded. Now consider, in place of (39), the differential
equation

$$\frac{d^2 z}{dt^2} + \lambda' z = 0 \tag{8-39'}$$

Its eigenvalues are the minima of $\Lambda'(z) = \int_0^\tau (dz/dt)^2\,dt$, where $\tau$ is the value
of $t$ at $x = b$, namely $\tau = \int_a^b (w/p)^{1/2}\,dx$. On the other hand, the
eigenvalues of (39) are the minima of $\Lambda(z) = \int_0^\tau [(dz/dt)^2 + f z^2]\,dt$. Suppose
that $z = z_1$ minimizes $\Lambda'$. If this same $z_1$ were used in constructing $\Lambda$, the
value of $\Lambda$ would differ from the minimum of $\Lambda'$ by $\int_0^\tau f z_1^2\,dt$. Since $z_1$ is
not the proper function for minimizing $\Lambda$, the minimum of $\Lambda$ will differ
from that of $\Lambda'$ by less than this integral. But $\left|\int_0^\tau f z^2\,dt\right| < F$ where $F$ is
the absolute value of the maximum of the function $f(t)$, because $z$ is
normalized. Hence we conclude that the eigenvalues of (39) differ from those
of (39') by no more than $F$, a finite quantity. If the $\lambda'$ values tend to
infinity, the $\lambda$'s also do.

* Suppose $u_n$ produces the minimum $\lambda_n$. Of the function $u_{n+1}$ we require that it be
orthogonal not only to the $n - 1$ functions with respect to which $u_n$ has this property,
but also to $u_n$ itself. Hence the class of functions from which $u_{n+1}$ must be chosen is
more restricted, and the minimum produced by $u_{n+1}$ cannot lie below $\lambda_n$. Now there is
an infinite number of functions orthogonal to the set $u_0, \cdots, u_n$, and it is hard to
believe that they will all produce the same eigenvalue $\lambda_n$.

But the eigenvalues of (39') are well known. They depend, of course, on
the boundary conditions for $z$, and hence for $u$. If $u$ vanishes at both $a$
and $b$, so that $z$ vanishes at $0$ and $\tau$, the eigenvalues are

$$\lambda_n' = \frac{n^2 \pi^2}{\tau^2}$$

In case only periodicity of $z$ is required (see example of preceding section),
the eigenvalues are

$$\lambda_n' = \frac{4 n^2 \pi^2}{\tau^2}$$

In any case,

$$\lim_{n\to\infty} \frac{\lambda_n'}{n^2} = \text{const.}$$

Since the "high" eigenvalues $\lambda$ approach the "high" values of $\lambda'$, theorem
(38) is established. It is to be observed that our result in this particular
form is conditional upon the assumption of a finite $\tau$, which is usually
equivalent to a finite range of $x$. Several of the equations listed in Table 1 are
ordinarily treated for infinite ranges of the independent variable; for
these, theorem (38) is not valid because $\tau$ becomes $\infty$. Hermite's
equation is a case in point: its eigenvalues are proportional to $n$ rather than
$n^2$ even asymptotically. But here, as well as in all other cases, it is still
true that

$$\lambda_n \to \infty \quad \text{as } n \to \infty \tag{8-40}$$
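
The statement that the eigenvalues of (39) stay within $F$ of those of (39') can be illustrated numerically. The following is a minimal sketch under assumptions of our own: $\tau = \pi$, boundary condition $z(0) = z(\tau) = 0$, and the sample function $f(t) = 0.5 + 0.4\cos 2t$ (so that $F = 0.9$). It integrates eq. (39) by a Runge-Kutta shooting method and locates each eigenvalue by bisection inside the interval $n^2\pi^2/\tau^2 \pm F$; none of the function names come from the text.

```python
import math

def shoot(lam, f, tau, steps=400):
    """RK4 for z'' = (f(t) - lam) z with z(0) = 0, z'(0) = 1; returns z(tau)."""
    h = tau / steps
    t, z, dz = 0.0, 0.0, 1.0
    acc = lambda tt, zz: (f(tt) - lam) * zz
    for _ in range(steps):
        k1z, k1v = dz, acc(t, z)
        k2z, k2v = dz + 0.5 * h * k1v, acc(t + 0.5 * h, z + 0.5 * h * k1z)
        k3z, k3v = dz + 0.5 * h * k2v, acc(t + 0.5 * h, z + 0.5 * h * k2z)
        k4z, k4v = dz + h * k3v, acc(t + h, z + h * k3z)
        z += h * (k1z + 2 * k2z + 2 * k3z + k4z) / 6
        dz += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += h
    return z

def eigenvalue(n, f, tau, bound, iterations=40):
    """Bisect for the zero of z(tau; lam) within `bound` of (n*pi/tau)**2."""
    free = (n * math.pi / tau) ** 2          # eigenvalue of eq. (39')
    lo, hi = free - bound, free + bound
    g_lo = shoot(lo, f, tau)
    if g_lo * shoot(hi, f, tau) > 0:
        raise ValueError("no sign change in the bracket")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        g_mid = shoot(mid, f, tau)
        if g_lo * g_mid <= 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)

def demo(count=5):
    """Eigenvalues of (39) for the assumed f(t); tau = pi makes lambda'_n = n^2."""
    tau, bound = math.pi, 0.9
    f = lambda t: 0.5 + 0.4 * math.cos(2 * t)    # |f(t)| <= 0.9 everywhere
    return [(n, eigenvalue(n, f, tau, bound)) for n in range(1, count + 1)]

if __name__ == "__main__":
    for n, lam in demo():
        print(n, round(lam, 4), "lambda/n^2 =", round(lam / n ** 2, 4))
```

With these choices $\lambda_n'= n^2$, so the printed ratio $\lambda_n/n^2$ should drift toward unity as $n$ grows, in the sense of theorem (38).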

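It is interesting to note that the solutions of eq. (39') are asymptotically
(for large $\lambda$) equal to those of (39). Thus

$$\lim_{n\to\infty} z_n = A_n \sin \frac{n\pi}{\tau} t$$

provided the boundary condition is: $z(0) = z(\tau) = 0$. In terms of $u$ this
reads

$$\lim_{n\to\infty} u_n = A_n (pw)^{-1/4} \sin\left\{ n\pi \left[\int_a^x \left(\frac{w}{p}\right)^{1/2} dx\right] \left[\int_a^b \left(\frac{w}{p}\right)^{1/2} dx\right]^{-1} \right\}$$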

8.8. Completeness of Eigenfunctions. — In sec. 8.2 there appeared a
qualitative, though crude, definition of completeness. We now wish to
give that definition greater precision and to prove it under the conditions
outlined in sec. 8.5. A system of functions $u_1, u_2, \cdots$ is complete if it is
possible to "approximate in the mean" any function $f(x)$, satisfying the
same boundary conditions as the $u$'s, by means of a series $\sum_{i=1}^n a_i u_i$, that is, if

$$\lim_{n\to\infty} \int \left(f - \sum_{i=1}^n a_i u_i\right)^2 w\,dx = 0 \tag{8-41}$$

We are here concerned with functions $u$ which are solutions of eq. (26);
hence we know them to be orthogonal. This permits at once the
determination of the coefficients $a_i$. If, for any given, finite $n$, we wish to make the
quantity

$$M = \int \left(f - \sum_{i=1}^n a_i u_i\right)^2 w\,dx$$

as small as possible, then

$$\frac{\partial M}{\partial a_j} = 0 \quad \text{for } j = 1, 2, \cdots, n$$

The differentiations may be carried out under the integral sign, so that

$$\int \left(f - \sum_{i=1}^n a_i u_i\right) u_j w\,dx = 0$$

whence

$$a_j = \int f u_j w\,dx \tag{8-42}$$

This, then, is the best choice of coefficients with which we may hope to
satisfy (41).

Now introduce the following abbreviations:

$$\Delta_n = f - \sum_{i=1}^n a_i u_i, \qquad c_n^2 = \int \Delta_n^2 w\,dx$$

We shall show that the function $\Delta_n/c_n$ has the following properties:
(1) it is normalized, (2) it is orthogonal to every $u_i$ up to and including
$u_n$. The first property is obvious; the second is easily seen as follows:

$$\int \frac{\Delta_n}{c_n} u_i w\,dx = \frac{1}{c_n} \left[\int f u_i w\,dx - \sum_{j=1}^n a_j \int u_j u_i w\,dx\right] =
\begin{cases}
\dfrac{1}{c_n}(a_i - a_i) & \text{if } i \le n \\[2ex]
\dfrac{1}{c_n}(a_i - 0) & \text{if } i > n
\end{cases}$$

But if $\Delta_n/c_n$ has these two properties, it satisfies all the conditions which, in
the variational procedure, we imposed upon $u_{n+1}$ except that of minimizing
$\Lambda$. Hence it is clear that

$$\Lambda\left(\frac{\Delta_n}{c_n}\right) \ge \lambda_{n+1}$$

and this means:

$$\frac{1}{c_n^2} \Lambda(\Delta_n) \ge \lambda_{n+1} \tag{8-43}$$

The remainder of our argument consists in proving that $\Lambda(\Delta_n)$ is finite. If
the reader will accept this fact,* which is almost obvious from the meaning
of $\Delta_n$, the last inequality leads at once to (41); for as $n$ approaches infinity,
the right-hand side tends to infinity in view of (40), hence

$$\lim_{n\to\infty} c_n^2 = 0$$

This is the same as (41).
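
The argument can be followed numerically on a simple case. The sketch below is illustrative only and rests on our own assumptions: the vibrating string with $p = w = 1$, $q = 0$ on $0 \le x \le l$, normalized eigenfunctions $\sqrt{2/l}\,\sin(i\pi x/l)$ with eigenvalues $(i\pi/l)^2$, and the sample function $f(x) = x(l - x)$. It computes the coefficients from (42), the mean square error $c_n^2$, and both sides of inequality (43).

```python
import math

def simpson(func, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def expansion_errors(n_max=5, l=1.0):
    """Rows (n, c_n^2, Lambda(Delta_n)/c_n^2, lambda_{n+1}) for f = x(l - x)."""
    f = lambda x: x * (l - x)
    df = lambda x: l - 2 * x
    u = lambda i, x: math.sqrt(2 / l) * math.sin(i * math.pi * x / l)
    du = lambda i, x: math.sqrt(2 / l) * (i * math.pi / l) * math.cos(i * math.pi * x / l)
    lam = lambda i: (i * math.pi / l) ** 2
    coeffs, rows = [], []
    for n in range(1, n_max + 1):
        # eq. (8-42) with w = 1
        coeffs.append(simpson(lambda x: f(x) * u(n, x), 0.0, l))
        delta = lambda x: f(x) - sum(a * u(i + 1, x) for i, a in enumerate(coeffs))
        ddelta = lambda x: df(x) - sum(a * du(i + 1, x) for i, a in enumerate(coeffs))
        c2 = simpson(lambda x: delta(x) ** 2, 0.0, l)
        # Lambda(Delta_n) reduces to the integral of (Delta_n')^2 for p = 1, q = 0
        big_lambda = simpson(lambda x: ddelta(x) ** 2, 0.0, l)
        rows.append((n, c2, big_lambda / c2, lam(n + 1)))
    return rows

if __name__ == "__main__":
    for n, c2, ratio, lam_next in expansion_errors():
        print(f"n={n}  c_n^2={c2:.3e}  Lambda/c_n^2={ratio:.1f}  lambda_(n+1)={lam_next:.1f}")
```

The mean square error never increases with $n$ and falls rapidly, while the quotient $\Lambda(\Delta_n)/c_n^2$ stays above $\lambda_{n+1}$, as (43) demands.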


* For the more exacting reader, we here indicate the proof. The integral $\Lambda(\Delta_n)$
may be transformed in accordance with the first three steps of (37) into

$$\Lambda(\Delta_n) = -\int \Delta_n L(\Delta_n)\,dx$$

But

$$\int \Delta_n L(\Delta_n)\,dx = \int \left(f - \sum a_i u_i\right) L\left(f - \sum a_i u_i\right) dx = \int f L(f)\,dx + \sum_{i=1}^n a_i^2 \lambda_i$$

because

$$L(u_i) = -\lambda_i w u_i \quad \text{and} \quad \int u_i L(f)\,dx = \int f L(u_i)\,dx = -a_i \lambda_i$$

Hence

$$\Lambda(\Delta_n) = \Lambda(f) - \sum_{i=1}^n a_i^2 \lambda_i$$

The existence of $\Lambda(f)$ must be assumed, for otherwise an expansion of $f$ in terms of the
$u_i$'s may be impossible. Moreover, $f$ and therefore the approximating function
$\varphi_n = \sum_{i=1}^n a_i u_i$ must possess integrable squares. Let us suppose that

$$\int \varphi_n^2 w\,dx = \sum_{i=1}^n a_i^2 = M_n$$

If we add zero in the form $\sum_{i=1}^n a_i^2 \lambda_1 - M_n \lambda_1$, where $\lambda_1$ is the lowest of all eigenvalues, to
the last expression for $\Lambda(\Delta_n)$ we obtain

$$\Lambda(\Delta_n) = \Lambda(f) - \sum_{i=1}^n a_i^2 (\lambda_i - \lambda_1) - M_n \lambda_1$$

The difference $\Lambda(f) - M_n \lambda_1$ is certainly finite for all $n$. Let us call it $A$. The
summation on the right consists of positive terms only. Inequality (43) may therefore
certainly be written

$$\frac{A}{c_n^2} \ge \lambda_{n+1}$$

This forces $c_n$ to become zero for large $n$ since $\lambda_{n+1}$ tends to $\infty$.
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8.9. Further Comments and Generalizations. — In the last section we 
have shown, not that

$$f = \sum_{i=1}^\infty a_i u_i \tag{8-44}$$

but rather that the series on the right approximates $f$ in the mean in
accordance with eq. (41). To put the difference more concretely: Eq. (41)
may be true and yet (44) may not hold for all points of the range
$(a \le x \le b)$. It is clear that if (44) is true almost everywhere but fails
at a finite set of points, the contribution of these points to the integral in
(41) would be nil and that equation would be true. To prove (44) in
addition to (41) would involve the establishment of absolute and uniform
convergence of the series $\sum_{i=1}^\infty a_i u_i$. For the solutions of eq. (26) with
boundary conditions of the type here chosen this can indeed be done,* and
the reader need not be excessively concerned over the difference between
"completeness" (expressed by eq. 41) and the possibility of expansion of
an arbitrary function (indicated by 44).

The preceding theory has always involved the assumption of a finite
range, $b - a$, of the independent variable. This is clearly a serious
limitation, for it excludes the usual solutions of a number of the equations listed
in Table 1. To develop a rigorous account of the situation arising when
the range is extended to infinity is not easy, but what happens qualitatively
under such conditions can be readily seen.

Consider again the vibrating string with eigenvalues $k^2 = n^2\pi^2/l^2$.
As $l$ tends to infinity, these eigenvalues move closer together until in the
limit they form a continuum. The eigenfunctions are still of the form
$A \sin(kx + \delta)$, but they refuse to be normalized in the former sense; for
clearly the integral $\int \sin^2 kx\,dx$, when taken over an infinite range,
diverges. Also, since the eigenvalues are no longer discrete, our definition
of orthogonality loses its sense. However, completeness is still guaranteed
since what was originally a Fourier series will now become a Fourier
integral (cf. eq. 14). The difficulty concerning orthogonality and
normalization can, however, be avoided by introducing "eigendifferentials" instead
of eigenfunctions.†

The situation brought about by an extension of the range may be even
more complicated than this. We shall see in Chapter 11 that the
differential equation describing the hydrogen atom (eq. 11-55), which is closely
related to Laguerre's, admits, because of its infinite range, both a discrete
and a continuous set of eigenvalues ("spectrum"). This phenomenon is
of very frequent occurrence. On the other hand, eigenvalues associated
with a Sturm-Liouville problem of infinite range are not necessarily
continuous, as the example of the simple harmonic oscillator (cf. Chapter 11)
or Hermite's differential equation (eq. 2-62) clearly shows.

* See Courant-Hilbert, p. 370.

† See, for instance, Kemble, E. C., "The Fundamental Principles of Quantum
Mechanics," McGraw-Hill Book Co., 1937, p. 162 et seq.

No mention has thus far been made of the possibility that the solutions
of the Sturm-Liouville equation may possess singularities in the range
$a \le x \le b$. Troubles of this sort might have been circumvented by
postulating that the function $p$ appearing in eq. (27) be always of one
sign and never zero, as is sometimes done in treatments of the eigenvalue
problem. This, however, would have excluded some interesting cases from
Table 1, notably Legendre's equation which has (non-essential) singular
points at $x = \pm 1$, and Hermite's equation which has an essential
singularity at $\infty$. Suffice it to say here that these matters, though of
considerable fundamental interest, occasion no modification of the conclusions here
derived. Attention is given to them in Kemble's book (loc. cit.).

The solutions of eq. (26) have been assumed to be real functions
throughout this section. If the functions $p$, $q$, and $w$ are real, this entails
no loss in generality. For suppose that a complex function $u = X + iY$
were admitted as solution of the differential equation; this would merely
imply that both $X$ and $Y$ are real solutions belonging to the same
eigenvalue. Thus, whenever complex solutions arise and are compatible with
the boundary conditions, we may at once conclude that the
corresponding eigenvalue is degenerate.* (In the complex scheme, both $u$ and
$u^* = X - iY$ are linearly independent solutions.) If now we require as
normalizing condition

$$\int u^* u w\,dx = 1$$

we are merely postulating that, in place of the usual normalization
$\int X^2 w\,dx = \int Y^2 w\,dx = 1$,

$$\int X^2 w\,dx + \int Y^2 w\,dx = 1$$

shall hold. In other words, we are operating, in the complex scheme, with
linear combinations of the real functions, and with a different
normalization. Orthogonality, if defined by eq. (5*) instead of (5), reverts to its
ordinary meaning, for

$$\int u_1^* u_2 w\,dx = \int [X_1 X_2 + Y_1 Y_2 + i X_1 Y_2 - i Y_1 X_2] w\,dx = 0$$

is an immediate consequence of the fact that eigenfunctions belonging to
different eigenvalues $\lambda_1$ and $\lambda_2$ are orthogonal. Furthermore, if we require
$u^*$ and $u$ to be orthogonal,

$$\int u^2 w\,dx = \int (X^2 - Y^2 + 2iXY) w\,dx = \int X^2 w\,dx - \int Y^2 w\,dx = 0$$

provided $X$ and $Y$ have been chosen orthogonal. Thus both $X$ and $Y$ are
normalized to $\tfrac{1}{2}$ when the complex formalism is used. In view of these
simple facts the validity of the completeness proof remains intact
for complex $f$ and complex $u$; only formal changes are necessary.
The $a_i$ become complex, and completeness is defined by the relation

$$\lim_{n\to\infty} \int \Delta_n^* \Delta_n w\,dx = 0.$$

Complete revision of the theory is necessary when
the coefficients $p$, $q$, $w$ are permitted to be complex, but such cases are
rarely of interest in physics and chemistry.

* We may, for instance, write the solution of eq. (2) in the form $S = A e^{ikx}$.

Finally, it is appropriate to remark that our development has been 
restricted to one dimension. The Sturm-Liouville theory can be gener- 
alized without great difficulty to certain partial differential equations with 
much the same results. For this generalization we refer the reader to 
Courant-Hilbert. 



CHAPTER 9 

MECHANICS OF MOLECULES 


9.1. Introduction. — As an illustration of the mathematical methods 
used in mechanics, we discuss in this chapter an important physical and 
chemical problem, namely, the motion of a molecule containing n atoms. 
We limit ourselves to this single topic for several reasons: its complexity 
requires us to describe most of the mathematics used in mechanics; the 
same methods may be extended to other problems, for example, the 
motions of particles within the atomic nucleus or the motions of a
macroscopic body such as an aeroplane;¹ and finally because considerable interest
is now being shown by chemists and physicists in the spectrum of the
polyatomic molecule. This chapter will also present an opportunity for
dealing with the purely mathematical question of how to describe the
configuration of a rigid body (Euler's angles, etc.), a matter which is of great
generality and must be included in a survey of mathematical methods
used in science. Many adequate accounts of classical mechanics² exist so
that we do no more here than recall briefly some of the principles of that
subject before proceeding to the special problem in which we are interested.

¹ See Frazer, R. A., Duncan, W. J., and Collar, A. R., "Elementary Matrices,"
Cambridge University Press, 1938.

² Whittaker, E. T., "Analytical Dynamics of Particles and Rigid Bodies," Third
Edition, Cambridge University Press, 1927; Ames, J. S., and Murnaghan, F. D.,
"Theoretical Mechanics," Ginn and Co., 1929; Macmillan, W. D., "Dynamics of Rigid
Bodies," McGraw-Hill Book Co., 1936.

9.2. General Principles of Classical Mechanics. — A free particle is one
whose motion is completely unrestricted. It is said to have three degrees
of freedom, for its position is uniquely determined at any instant by three
independent coordinates. Consider a system containing $n$ such particles,
where the instantaneous position of the $i$-th particle of mass $m_i$ is specified
by the vector $\mathbf{r}_i$. If $\mathbf{F}_i$ is the vector resultant of all the forces acting upon
the particle then the motion of the system is described by Newton's
equations which may be written in the form

$$m_i \frac{d^2 \mathbf{r}_i}{dt^2} = m_i \ddot{\mathbf{r}}_i = \mathbf{F}_i; \quad (i = 1, 2, \cdots, n) \tag{9-1}$$

In many cases, the particles composing the system are not free but
restricted. For example, a member of the system may be allowed to
move only on a surface, so that its degrees of freedom become two. Under
such circumstances the equation of the surface is called the constraint. In a
similar way if the particle is required to move along a line, there is only one
degree of freedom and the two equations which define the line are the
constraints. If the sum of the degrees of freedom of all the particles is
$k < 3n$, then the system may be regarded as a collection of free particles
subjected to $3n - k$ independent constraints so that only $k$ coordinates
are needed to describe the motion of the system. These new coordinates
$q_1, q_2, \cdots, q_k$ are related to the Cartesian coordinates of the particles (cf.
eqs. 5-1 and 5-2); they are called the generalized coordinates of Lagrange.

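If, for convenience, we let the Cartesian components of $\mathbf{r}_1$ be $x_1, x_2, x_3$;
the components of $\mathbf{r}_2$ be $x_4, x_5, x_6$ and so on (remembering also that
$m_1 = m_2 = m_3$; $m_4 = m_5 = m_6$; etc.), then the kinetic energy $T$ of the
system is given by

$$2T = \sum_{i=1}^{3n} m_i \dot{x}_i^2 = \sum_{r=1}^{k} \sum_{s=1}^{k} a_{rs} \dot{q}_r \dot{q}_s \tag{9-2}$$

where

$$a_{rs} = \sum_{i=1}^{3n} m_i \frac{\partial x_i}{\partial q_r} \frac{\partial x_i}{\partial q_s} \tag{9-3}$$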
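
As a small illustration of (9-2) and (9-3), the coefficients $a_{rs}$ may be evaluated numerically for one particle moving in a plane and described by polar coordinates. This is a sketch under our own assumptions: the generalized coordinates are $q_1 = r$, $q_2 = \theta$ with $x_1 = r\cos\theta$, $x_2 = r\sin\theta$, and the partial derivatives are approximated by central differences. All function names are hypothetical.

```python
import math

def cartesian(q):
    """Assumed transformation x_i(q): plane polar coordinates (r, theta)."""
    r, theta = q
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian(x_of_q, q, h=1e-6):
    """rows[r][i] approximates dx_i/dq_r by central differences."""
    rows = []
    for r in range(len(q)):
        q_plus, q_minus = list(q), list(q)
        q_plus[r] += h
        q_minus[r] -= h
        x_plus, x_minus = x_of_q(q_plus), x_of_q(q_minus)
        rows.append([(a - b) / (2 * h) for a, b in zip(x_plus, x_minus)])
    return rows

def kinetic_matrix(masses, x_of_q, q):
    """Coefficients a_rs of eq. (9-3); masses[i] belongs to coordinate x_i."""
    jac = jacobian(x_of_q, q)
    k = len(q)
    return [[sum(m * jac[r][i] * jac[s][i] for i, m in enumerate(masses))
             for s in range(k)] for r in range(k)]

def twice_kinetic_energy(a, qdot):
    """Double sum on the right of eq. (9-2)."""
    k = len(qdot)
    return sum(a[r][s] * qdot[r] * qdot[s] for r in range(k) for s in range(k))

if __name__ == "__main__":
    m, q, qdot = 2.0, (1.5, 0.4), (0.3, -1.1)
    a = kinetic_matrix([m, m], cartesian, q)
    print(a)                                   # close to [[m, 0], [0, m r^2]]
    print(twice_kinetic_energy(a, qdot))       # close to m (rdot^2 + r^2 thetadot^2)
```

For this choice one expects $a_{11} = m$, $a_{22} = m r^2$ and $a_{12} = a_{21} = 0$, so that $2T = m(\dot r^2 + r^2\dot\theta^2)$.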


Since the components of momentum in Cartesian coordinates are

$$p_i = m_i \dot{x}_i = \frac{\partial T}{\partial \dot{x}_i}$$

we define, by analogy, the generalized momenta as

$$p_r(q_1 q_2 \cdots; \dot{q}_1 \dot{q}_2 \cdots) = \frac{\partial T}{\partial \dot{q}_r} = \sum_{s=1}^{k} a_{rs} \dot{q}_s \tag{9-4}$$

In many physical problems, the system is conservative, that is, a
potential function $V(q_1, q_2, \cdots, q_k)$ exists such that

$$Q_i = -\frac{\partial V}{\partial q_i}; \quad (i = 1, 2, \cdots, k) \tag{9-5}$$

Then, as was shown in sec. 6.3 (cf. eq. 6-11), Lagrange's equations of
motion are

$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_i}\right) - \frac{\partial T}{\partial q_i} = Q_i; \quad (i = 1, 2, \cdots, k) \tag{9-6}$$

This is a set of $k$ differential equations of second order with $q_1, q_2, \cdots, q_k$
as dependent variables and $t$ as independent variable.

If we introduce the Lagrangian function

$$L(q_i, \dot{q}_i) = T(q_i, \dot{q}_i) - V(q_i) \tag{9-7}$$

eq. (6) becomes

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_i}\right) - \frac{\partial L}{\partial q_i} = 0; \quad (i = 1, 2, \cdots, k) \tag{9-8}$$

The solution of Lagrange's equations in either form (6) or (8) will result
in an expression for each generalized coordinate $q_i$ as a function of time and
$2k$ constants of integration. The latter must be determined from the
initial conditions of the $n$ particles of the system.

It is often of advantage to transform (6) or (8) to a set of $2k$ first order
differential equations. From (4), (8), and the definition $L = T - V$,
we have

$$p_i = \frac{\partial L}{\partial \dot{q}_i}; \quad \dot{p}_i = \frac{\partial L}{\partial q_i} \tag{9-9}$$

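We now define the Hamiltonian function

$$H = \sum_{i=1}^{k} p_i \dot{q}_i - L \tag{9-10}$$

Its total differential is

$$dH = \sum_{i=1}^{k} p_i\,d\dot{q}_i + \sum_{i=1}^{k} \dot{q}_i\,dp_i - \sum_{i=1}^{k} \frac{\partial L}{\partial q_i}\,dq_i - \sum_{i=1}^{k} \frac{\partial L}{\partial \dot{q}_i}\,d\dot{q}_i$$

But by using (9), the first and last terms cancel, giving

$$dH = \sum_{i=1}^{k} \dot{q}_i\,dp_i - \sum_{i=1}^{k} \frac{\partial L}{\partial q_i}\,dq_i \tag{9-11}$$

This equation depends only on $dp_i$ and $dq_i$ but not on $d\dot{q}_i$, hence $H$ is a
function of $q$ and $p$ alone and we may write

$$dH = \sum_{i=1}^{k} \frac{\partial H}{\partial p_i}\,dp_i + \sum_{i=1}^{k} \frac{\partial H}{\partial q_i}\,dq_i \tag{9-12}$$

Comparison of (11) with (12) shows us that

$$\frac{\partial H}{\partial p_i} = \dot{q}_i; \quad \frac{\partial H}{\partial q_i} = -\frac{\partial L}{\partial q_i} = -\dot{p}_i, \quad (i = 1, 2, \cdots, k) \tag{9-13}$$

The resulting first order differential equations (13), $2k$ in number, are
Hamilton's canonical equations of motion; $p_i$ and $q_i$ are said to be
canonically conjugate variables.

Problem. Show that $2T = \sum p_i \dot{q}_i$ and $H = T + V$.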
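
To see the canonical equations (13) at work, one may integrate them numerically for a single degree of freedom. The sketch below is an illustration under assumptions of our own: the Hamiltonian is that of a linear oscillator, $H = p^2/2m + \tfrac{1}{2}\kappa q^2$ (here $\kappa$ denotes a force constant, not the number of coordinates), the partial derivatives of $H$ are taken by central differences, and the time stepping is the classical fourth-order Runge-Kutta scheme.

```python
import math

def hamilton_rhs(H, q, p, eps=1e-6):
    """Return (dq/dt, dp/dt) = (dH/dp, -dH/dq), eq. (9-13), by central differences."""
    dH_dp = (H(q, p + eps) - H(q, p - eps)) / (2 * eps)
    dH_dq = (H(q + eps, p) - H(q - eps, p)) / (2 * eps)
    return dH_dp, -dH_dq

def integrate(H, q, p, t_end, dt=1e-3):
    """Classical RK4 applied to the two first order equations."""
    steps = int(round(t_end / dt))
    for _ in range(steps):
        k1q, k1p = hamilton_rhs(H, q, p)
        k2q, k2p = hamilton_rhs(H, q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
        k3q, k3p = hamilton_rhs(H, q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
        k4q, k4p = hamilton_rhs(H, q + dt * k3q, p + dt * k3p)
        q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return q, p

def oscillator(m=1.0, kappa=1.0):
    """Assumed example Hamiltonian H = T + V of a linear oscillator."""
    return lambda q, p: p * p / (2 * m) + 0.5 * kappa * q * q

if __name__ == "__main__":
    H = oscillator()
    q, p = integrate(H, 1.0, 0.0, 1.0)
    print(q, math.cos(1.0))     # q(t) = cos t for m = kappa = 1, q(0) = 1, p(0) = 0
    print(H(q, p))              # the energy stays near its initial value 0.5
```

The two constants of integration are fixed here by the initial values $q(0) = 1$, $p(0) = 0$; the numerical value of $H$ remains constant along the motion, as it must for a conservative system.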

9.3. The Rigid Body in Classical Mechanics. — As a crude first
approximation to the motion of a molecule we consider a rigid body which is defined
as a system of $n$ particles bound together by interior forces in such a way
that the distance between the $i$-th and $j$-th particles is constant and
unaffected by any external force to which the system is subjected.
Suppose $x_i, y_i, z_i$ are the Cartesian coordinates of the $i$-th particle, then the
distance between the $i$-th and $j$-th particle is

$$r_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2} = \text{constant}; \quad (i, j = 1, 2, \cdots, n) \tag{9-14}$$

It is readily shown that the most general displacement of a body of this
sort may be obtained in a variety of ways by a combination of translation
and rotation about an axis fixed in the body. The choice of a reference
point, that is, the origin of the vector which locates the fixed axis, is entirely
arbitrary. For a given displacement, this point may be chosen in such a
way that the translation is parallel to the axis of rotation. With this
choice of reference point, each displacement can be effected in one and
only one way, the resulting motion being similar to the displacement of a
nut on a threaded screw. It is thus only necessary to consider
translation and rotation in order to study the most general motion of a rigid
body. It should be remembered, however, that the axis of rotation may
be continually changing its direction, hence we usually refer to an
instantaneous axis of rotation.

9.4. Velocity, Angular Momentum, and Kinetic Energy. — Suppose a 
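rigid body is rotating about an axis with a constant angular velocity $\boldsymbol{\omega}$;
then the linear velocity of any point $P$ in the body is given by

$$\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r} \tag{9-15}$$

where $\mathbf{r}$ is a radius vector drawn to $P$ from a fixed point $O$ on the axis of
rotation (see eq. 4-16). If the point $P$ has a mass $m$, its momentum is

$$m\mathbf{v} = m(\boldsymbol{\omega} \times \mathbf{r}) \tag{9-16}$$

and its moment of momentum or angular momentum (see sec. 4.5) about the
point $O$ is

$$\mathbf{M} = \mathbf{r} \times m\mathbf{v} = m[\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})] \tag{9-17}$$

Suppose the fixed point $O$ about which the body is rotating is taken as the
origin of a Cartesian coordinate system $OXYZ$, the components of $\boldsymbol{\omega}$ are
$\omega_x, \omega_y, \omega_z$ and the components of $\mathbf{r}$ are $x, y, z$. Then in accordance with
eq. (4-13), the components of $\mathbf{v}$ are:

$$\begin{aligned}
v_x &= z\omega_y - y\omega_z \\
v_y &= x\omega_z - z\omega_x \\
v_z &= y\omega_x - x\omega_y
\end{aligned} \tag{9-18}$$

and the components of $\mathbf{M}$ are:

$$\begin{aligned}
M_x &= m(y v_z - z v_y) \\
M_y &= m(z v_x - x v_z) \\
M_z &= m(x v_y - y v_x)
\end{aligned} \tag{9-19}$$

On combining (18) and (19) there results

$$\begin{aligned}
M_x &= A\omega_x - F\omega_y - E\omega_z \\
M_y &= B\omega_y - D\omega_z - F\omega_x \\
M_z &= C\omega_z - E\omega_x - D\omega_y
\end{aligned} \tag{9-20}$$

where $A, B, C$ are moments of inertia and $D, E, F$ are products of inertia:

$$\begin{aligned}
A &= m(y^2 + z^2); & D &= myz \\
B &= m(z^2 + x^2); & E &= mzx \\
C &= m(x^2 + y^2); & F &= mxy
\end{aligned} \tag{9-21}$$

The kinetic energy $T$ of the particle at $P$ is given by

$$\begin{aligned}
2T &= m\mathbf{v} \cdot (\boldsymbol{\omega} \times \mathbf{r}) = m[\mathbf{v}\boldsymbol{\omega}\mathbf{r}] = m[\boldsymbol{\omega}\mathbf{r}\mathbf{v}] \\
&= m\boldsymbol{\omega} \cdot (\mathbf{r} \times \mathbf{v}) = \boldsymbol{\omega} \cdot \mathbf{M}
\end{aligned} \tag{9-22}$$

where we have used eqs. (4-17), (4-18), and (9-17). Thus, in view of
(20) we find

$$2T = A\omega_x^2 + B\omega_y^2 + C\omega_z^2 - 2D\omega_y\omega_z - 2E\omega_z\omega_x - 2F\omega_x\omega_y \tag{9-23}$$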
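
Relations (9-17) to (9-23) are easily verified for a single mass point. The following sketch, with arbitrarily assumed numerical values and function names of our own, compares the angular momentum obtained directly from the double vector product with the form containing moments and products of inertia, and checks the two expressions for $2T$.

```python
def cross(a, b):
    """Vector product a x b for 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def inertia_coefficients(m, r):
    """Moments A, B, C and products D, E, F of eq. (9-21) for one particle."""
    x, y, z = r
    return {"A": m * (y * y + z * z), "B": m * (z * z + x * x), "C": m * (x * x + y * y),
            "D": m * y * z, "E": m * z * x, "F": m * x * y}

def angular_momentum_direct(m, r, w):
    """M = m [r x (w x r)], eq. (9-17)."""
    return tuple(m * c for c in cross(r, cross(w, r)))

def angular_momentum_from_inertia(m, r, w):
    """Components of M according to eq. (9-20)."""
    c = inertia_coefficients(m, r)
    wx, wy, wz = w
    return (c["A"] * wx - c["F"] * wy - c["E"] * wz,
            c["B"] * wy - c["D"] * wz - c["F"] * wx,
            c["C"] * wz - c["E"] * wx - c["D"] * wy)

def twice_kinetic_energy(m, r, w):
    """Right-hand side of eq. (9-23)."""
    c = inertia_coefficients(m, r)
    wx, wy, wz = w
    return (c["A"] * wx ** 2 + c["B"] * wy ** 2 + c["C"] * wz ** 2
            - 2 * c["D"] * wy * wz - 2 * c["E"] * wz * wx - 2 * c["F"] * wx * wy)

if __name__ == "__main__":
    m, r, w = 2.0, (1.0, 2.0, 3.0), (0.5, -1.0, 2.0)   # assumed sample values
    print(angular_momentum_direct(m, r, w))
    print(angular_momentum_from_inertia(m, r, w))
    v = cross(w, r)                                    # eq. (9-15)
    print(twice_kinetic_energy(m, r, w), m * dot(v, v))
```

Both routes give the same vector $\mathbf{M}$, and the quadratic form (9-23) agrees with $m v^2$ and with $\boldsymbol{\omega} \cdot \mathbf{M}$.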

9.5. The Eulerian Angles. — We digress here to give explicit relations
useful for locating a point $P$ in a rigid body. Six parameters are needed.
Three of them will specify a fixed reference point in the body, which is not
necessarily at the origin of the coordinate system as in the preceding
discussion. Two more parameters are required to define the position of a line
fixed in the body and passing through the fixed point, while the sixth
parameter defines a rotation of the body about this line.

Suppose we attach a rigid framework $O'X'Y'Z'$ to the body and
denote the position of its origin relative to a coordinate system $OXYZ$
fixed in space by $x_0, y_0, z_0$. We will also suppose that we know the nine
direction cosines $a_{ij}$ of $O'X'Y'Z'$ relative to $OXYZ$. The point $P$ may
then be located in either coordinate system at will for we have the relations
(see sec. 4.1)

$$\begin{aligned}
x &= x_0 + a_{11}x' + a_{12}y' + a_{13}z' \\
y &= y_0 + a_{21}x' + a_{22}y' + a_{23}z' \\
z &= z_0 + a_{31}x' + a_{32}y' + a_{33}z'
\end{aligned}$$

where $x, y, z$ refer to $OXYZ$ and $x', y', z'$ refer to $O'X'Y'Z'$. Only six
parameters are required; the appearance of twelve is due to the fact that
the nine direction cosines are not linearly independent. Three of the
parameters needed are chosen as $x_0, y_0, z_0$, while the remaining three, which
are functions of the $a_{ij}$, may be taken in a variety of ways. The Eulerian
angles prove to be suitable for our purpose,³ these being defined as follows.

Let the fixed system $OXYZ$ and the rotating system $OX'Y'Z'$ have the
same origin as shown in Fig. 1. With $OZ$ vertical, draw $OL$ which is the
projection of $OZ'$ in the $XY$ plane. Direct it toward the bottom of the
page as shown in Fig. 2. Draw $OK$ perpendicular to the plane $ZOZ'$,
directing it toward the reader's right. This axis is called the line of
nodes; it is the intersection of the planes $OXY$ and $OX'Y'$. In the plane
$ZOZ'$, $OL$ makes an angle of 90° with $OZ$ while $OM$ is at an angle of 90° to
$OZ'$. These relations are all clearly shown in Fig. 2 where we give the
$XY$, $ZL$, and $X'Y'$ planes. The letters in parenthesis are the axes which
are perpendicular to the plane of the page. The three Eulerian angles⁴
are taken as

$$YOK = \alpha; \quad Z'OZ = \beta; \quad Y'OK = \gamma$$

From Fig. 2, it is seen that $OX'Y'Z'$ may be superimposed on $OXYZ$ by
the following operations: (1) rotate about $OZ$ by the angle $\alpha$; (2) rotate
through $\beta$ about $OK$, which will bring $OZ$ into coincidence with $OZ'$;
(3) rotate about $OZ'$ by $\gamma$ which brings $OX$ to $OX'$ and $OY$ to $OY'$. These
must be performed in the order given. The frame $OX'Y'Z'$ may be
brought back to $OXYZ$ by performing the same operations in reverse
order, the rotations being made in the opposite direction. Thus the roles
of the two frames may be interchanged by replacing $\alpha, \beta, \gamma$ by $-\gamma, -\beta, -\alpha$
in that order.

Fig. 9-2

³ Other quantities which are used to describe the position of a rigid body are the
parameters of Rodrigues and the Cayley-Klein parameters; see Whittaker, loc. cit.

⁴ The Eulerian angles are often defined in other ways. We follow Whittaker (loc.
cit.); in his notation, $\alpha = \phi$, $\beta = \theta$, $\gamma = \psi$.

To find the relation between $OXYZ$ and $OX'Y'Z'$ we see from Fig. 2
that the direction cosines of $OX'$, $OY'$, $OZ'$ with respect to $OX$ are equal
to the projections on $OX'$, $OY'$, $OZ'$ of a unit vector constructed along $OX$.
This vector has length $\cos\alpha$ along $OL$ and $-\sin\alpha$ along $OK$. Now $\cos\alpha$
along $OL$ becomes $\cos\alpha \sin\beta$ when projected along $OZ'$ and $\cos\alpha \cos\beta$
when projected along $OM$. The latter becomes $\cos\alpha \cos\beta \cos\gamma$ along
$OX'$ and $-\cos\alpha \cos\beta \sin\gamma$ along $OY'$. Similarly $-\sin\alpha$ along $OK$
becomes $-\sin\alpha \sin\gamma$ on $OX'$ and $-\sin\alpha \cos\gamma$ when projected on $OY'$.
Proceeding in this way, we obtain the results in Table 1 for the direction
cosines of the two sets of coordinate axes. If a body moves in such a way
that $\alpha, \beta, \gamma$ remain constant, the motion is a translation; if $x_0, y_0, z_0$ are
constant, it is a rotation.


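TABLE 1

$$\begin{array}{c|ccc}
 & OX & OY & OZ \\
\hline
OX' & \cos\alpha\cos\beta\cos\gamma - \sin\alpha\sin\gamma & \sin\alpha\cos\beta\cos\gamma + \cos\alpha\sin\gamma & -\sin\beta\cos\gamma \\
OY' & -\cos\alpha\cos\beta\sin\gamma - \sin\alpha\cos\gamma & -\sin\alpha\cos\beta\sin\gamma + \cos\alpha\cos\gamma & \sin\beta\sin\gamma \\
OZ' & \cos\alpha\sin\beta & \sin\alpha\sin\beta & \cos\beta
\end{array}$$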
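
The entries of Table 1 can be checked against the three successive rotations described above. The sketch below is an illustration with assumptions of our own: each rotation is represented by the matrix that gives the components of a fixed vector along the turned axes, and the line of nodes $OK$ is identified with the $Y$ axis as it stands after the first rotation (consistent with $YOK = \alpha$). The function names are hypothetical.

```python
import math

def table_matrix(alpha, beta, gamma):
    """Direction cosines of Table 1: rows OX', OY', OZ'; columns OX, OY, OZ."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [ca * cb * cg - sa * sg, sa * cb * cg + ca * sg, -sb * cg],
        [-ca * cb * sg - sa * cg, -sa * cb * sg + ca * cg, sb * sg],
        [ca * sb, sa * sb, cb],
    ]

def axes_turned_about_z(t):
    """Components along axes turned by t about Z, in terms of the old components."""
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def axes_turned_about_y(t):
    """Same for axes turned by t about Y."""
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def max_abs_difference(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

def composed_matrix(alpha, beta, gamma):
    """Operations (1), (2), (3): alpha about OZ, beta about OK, gamma about OZ'."""
    return matmul(axes_turned_about_z(gamma),
                  matmul(axes_turned_about_y(beta), axes_turned_about_z(alpha)))

if __name__ == "__main__":
    angles = (0.4, 1.1, -0.7)   # assumed sample values of alpha, beta, gamma
    print(max_abs_difference(table_matrix(*angles), composed_matrix(*angles)))
```

The same check confirms that the nine direction cosines form an orthogonal matrix, and that replacing $\alpha, \beta, \gamma$ by $-\gamma, -\beta, -\alpha$ interchanges the roles of the two frames (the matrix is transposed).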
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In order to obtain the angular velocity in terms of the Eulerian angles, 
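let us suppose that its components along $OZ$, $OK$ and $OZ'$ are $\dot\alpha$, $\dot\beta$, $\dot\gamma$.
Now $\dot\beta$ can be resolved into $\dot\beta \sin\gamma$ along $OX'$ and $\dot\beta \cos\gamma$ along $OY'$, while
$\dot\alpha$ becomes $-\dot\alpha \sin\beta \cos\gamma$ along $OX'$, $\dot\alpha \sin\beta \sin\gamma$ along $OY'$ and $\dot\alpha \cos\beta$
along $OZ'$. We thus obtain

$$\begin{aligned}
\omega_x &= -\sin\beta \cos\gamma\,\dot\alpha + \sin\gamma\,\dot\beta \\
\omega_y &= \sin\beta \sin\gamma\,\dot\alpha + \cos\gamma\,\dot\beta \\
\omega_z &= \cos\beta\,\dot\alpha + \dot\gamma
\end{aligned} \tag{9-24}$$

for the three components of the angular velocity about the axes $OX'$, $OY'$,
and $OZ'$. In terms of the Eulerian angles, the kinetic energy of a rotating
symmetric top ($A = B$), which we shall need later, is seen from eq. (23)
to be

$$T = \tfrac{1}{2}[A\dot\beta^2 + A\dot\alpha^2 \sin^2\beta + C(\dot\gamma + \dot\alpha \cos\beta)^2] \tag{9-25}$$

provided we choose axes in such a way that $D = E = F = 0$.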
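
A numerical spot check of (9-25) may be of use before attempting the problems. The sketch below, with arbitrarily assumed values of the angles, their rates and the moments of inertia, evaluates the angular velocity components from (9-24), inserts them in (9-23) with $D = E = F = 0$ and $A = B$, and compares the result with (9-25). The function names are our own.

```python
import math

def body_angular_velocity(beta, gamma, d_alpha, d_beta, d_gamma):
    """Components of the angular velocity along OX', OY', OZ', eq. (9-24)."""
    wx = -math.sin(beta) * math.cos(gamma) * d_alpha + math.sin(gamma) * d_beta
    wy = math.sin(beta) * math.sin(gamma) * d_alpha + math.cos(gamma) * d_beta
    wz = math.cos(beta) * d_alpha + d_gamma
    return wx, wy, wz

def kinetic_energy_principal(A, B, C, w):
    """Eq. (9-23) when the products of inertia D, E, F all vanish."""
    wx, wy, wz = w
    return 0.5 * (A * wx ** 2 + B * wy ** 2 + C * wz ** 2)

def kinetic_energy_top(A, C, beta, d_alpha, d_beta, d_gamma):
    """Eq. (9-25) for the symmetric top (A = B)."""
    return 0.5 * (A * d_beta ** 2 + A * d_alpha ** 2 * math.sin(beta) ** 2
                  + C * (d_gamma + d_alpha * math.cos(beta)) ** 2)

if __name__ == "__main__":
    A, C = 3.0, 1.5                       # assumed moments of inertia
    beta, gamma = 0.8, 0.3                # assumed angles
    rates = (1.2, -0.5, 2.0)              # assumed alpha-dot, beta-dot, gamma-dot
    w = body_angular_velocity(beta, gamma, *rates)
    print(kinetic_energy_principal(A, A, C, w))
    print(kinetic_energy_top(A, C, beta, *rates))
```

The angle $\gamma$ drops out of the kinetic energy because $\omega_x^2 + \omega_y^2 = \dot\alpha^2 \sin^2\beta + \dot\beta^2$ when $A = B$.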

Problem a. Finish the calculations leading to Table 1. 

Problem b. Complete the calculation of eq. (9-24). 

Problem c. Verify eq. (9-25). 



9.6. Absolute and Relative Velocity. — We now return to a more 
general consideration of the motion of a rigid body. Suppose a point P in 
it is located, as shown in Fig. 3, relative to OXYZ by the vector Tq and 
relative to O'X'Y'Z' by the vector r. Let the instantaneous position of 
the origin of O'X'Y'Z' be measured relative to OXFZ by r'.® Then the 
absolute position of P is given by 

To = r' + r 

* Note that primes in this chapter never mean differentiation. 


(»- 26 ) 
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and its absolute velocity by 

Vo = v' + v" (9-27) 

where $\mathbf{v}' = d\mathbf{r}'/dt$ measures the velocity of the origin of O'X'Y'Z' relative
to OXYZ and $\mathbf{v}''$ is the velocity of the point in the moving system. Now
suppose that the latter system is rotating with constant angular velocity of
$\boldsymbol{\omega}$ radians per second; then the point P has a linear velocity
$\boldsymbol{\omega} \times \mathbf{r}$ in addition to its translational velocity $\mathbf{v}$ relative to O'X'Y'Z'.
The components of $\mathbf{v}$ are $v_x = dr_x/dt = \dot{r}_x$; $v_y = \dot{r}_y$; $v_z = \dot{r}_z$. Thus,

$$\mathbf{v}_0 = \mathbf{v}' + \boldsymbol{\omega} \times \mathbf{r} + \mathbf{v} \tag{9-28}$$

It is important to have a clear understanding of the separate terms in
(28). The absolute velocity of the point is $\mathbf{v}_0$; $\mathbf{v}$ is the apparent velocity
of P measured by an observer in the system O'X'Y'Z' who does not know
that his coordinate axes are rotating, while $\boldsymbol{\omega} \times \mathbf{r}$ is the absolute velocity
which the terminus of $\mathbf{r}$ must have in order to maintain its position in the
moving body. The last velocity is often called the velocity of following.
If the point P is rigidly attached to the moving system, $\mathbf{v} = 0$; if the moving
system and the fixed system have coincident origins, $\mathbf{v}' = 0$.
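As a numerical illustration of eq. (28), the following minimal Python sketch adds the three contributions to the absolute velocity. The function names `cross` and `absolute_velocity` are illustrative and do not come from the text, and all vectors are assumed to be given by their components along one common set of axes.

```python
def cross(a, b):
    """Vector product a x b of two 3-component sequences."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def absolute_velocity(v_origin, omega, r, v_rel):
    """Eq. (9-28): v0 = v' + omega x r + v (all vectors in the same axes)."""
    following = cross(omega, r)          # the "velocity of following"
    return tuple(a + b + c for a, b, c in zip(v_origin, following, v_rel))


# A point rigidly attached to the moving system (v = 0), coincident
# origins (v' = 0), rotation of 2 rad/s about the Z axis.
v0 = absolute_velocity((0.0, 0.0, 0.0), (0.0, 0.0, 2.0),
                       (1.0, 0.0, 0.0), (0.0, 0.0, 0.0))
print(v0)  # (0.0, 2.0, 0.0)
```

In this special case only the velocity of following survives, as the closing remarks of the section state.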

9.7. Motion of a Molecule. — In a molecule, we may consider the elec- 
trons and nuclei as bound together in a rigid framework which moves 
through space in translational motion and which rotates around its center 
of gravity. Both of these types of motion are included in the equations 
already given. One further motion is needed, however, for the nuclei 
execute oscillations around an equilibrium position. In order to allow for 
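this vibrational motion, let $\mathbf{r}_i$ be the instantaneous position vector of the
i-th particle and $\mathbf{a}_i$, $\boldsymbol{\rho}_i$ be the equilibrium and displacement vectors,
respectively, so that

$$\mathbf{r}_i = \mathbf{a}_i + \boldsymbol{\rho}_i \tag{9-29}$$

while

$$\mathbf{r}_{0i} = \mathbf{r}' + \mathbf{r}_i \tag{9-30}$$

is the instantaneous position of the point relative to OXYZ as shown in
Fig. 4.

Then from (28)

$$\mathbf{v}_{0i} = \mathbf{v}' + (\boldsymbol{\omega} \times \mathbf{r}_i) + \mathbf{v}_i \tag{9-31}$$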

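and

$$\begin{aligned}
2T = \sum m_i v_{0i}^2 = {}& v'^2 \sum m_i + \sum m_i v_i^2 + \sum m_i (\boldsymbol{\omega} \times \mathbf{r}_i) \cdot (\boldsymbol{\omega} \times \mathbf{r}_i) \\
& + 2\mathbf{v}' \cdot \sum m_i \mathbf{v}_i + 2\sum m_i \mathbf{r}_i \cdot (\mathbf{v}' \times \boldsymbol{\omega}) + 2\boldsymbol{\omega} \cdot \sum (m_i \mathbf{r}_i \times \mathbf{v}_i)
\end{aligned} \tag{9-32}$$

The reason for writing the last two terms of (32) as given comes from
eq. (4-18) since

$$\mathbf{v}' \cdot (\boldsymbol{\omega} \times \mathbf{r}) = \mathbf{r} \cdot (\mathbf{v}' \times \boldsymbol{\omega}); \qquad \mathbf{v} \cdot (\boldsymbol{\omega} \times \mathbf{r}) = \boldsymbol{\omega} \cdot (\mathbf{r} \times \mathbf{v})$$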
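The two triple-product identities used above can be checked numerically. The short Python sketch below does so for arbitrarily chosen integer vectors; the helper names `dot` and `cross` and the sample vectors are assumptions made only for this illustration, and a numerical check is of course not a proof.

```python
def dot(a, b):
    """Scalar product of two 3-component sequences."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


v_prime = (1, 2, 3)    # velocity of the moving origin (sample values)
omega = (-2, 0, 5)     # angular velocity (sample values)
r = (4, -1, 2)         # position in the moving system (sample values)
v = (0, 3, -1)         # velocity relative to the moving system (sample values)

# v' . (omega x r) = r . (v' x omega)
print(dot(v_prime, cross(omega, r)), dot(r, cross(v_prime, omega)))  # 59 59
# v . (omega x r) = omega . (r x v)
print(dot(v, cross(omega, r)), dot(omega, cross(r, v)))              # 70 70
```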



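Six further relations are needed to define the rotating coordinate system.
These⁶ may conveniently be taken as

$$\sum m_i \mathbf{r}_i = 0 \tag{9-33}$$

$$\sum m_i \mathbf{a}_i \times \mathbf{v}_i = 0 \tag{9-34}$$

The first three of these equations locate the origin of O'X'Y'Z' at the
center of gravity of the system, for that point is given by

$$\bar{\mathbf{r}} = \frac{\sum m_i \mathbf{r}_i}{\sum m_i}$$

and if $\bar{\mathbf{r}} = 0$, $\sum m_i \mathbf{r}_i = 0$ and $\sum m_i \mathbf{v}_i = 0$. The second condition, eq. (34),
states that there is no angular momentum relative to O'X'Y'Z' when all
particles occupy their equilibrium positions, i.e., when every $\mathbf{r}_i = \mathbf{a}_i$.

Using (29), (33), and (34), eq. (32) becomes

$$\begin{aligned}
2T &= v'^2 \sum m_i + \sum m_i v_i^2 + \sum m_i (\boldsymbol{\omega} \times \mathbf{r}_i) \cdot (\boldsymbol{\omega} \times \mathbf{r}_i) + 2\boldsymbol{\omega} \cdot \sum (m_i \boldsymbol{\rho}_i \times \mathbf{v}_i) \\
&= 2(T_t + T_v + T_r + T_{int})
\end{aligned} \tag{9-35}$$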

Inspection of (35) shows that the kinetic energy is a sum of four terms which
may be interpreted in order as due to the translational motion of the molecule
as a whole through space ($T_t$); the vibrational motion of the nuclei
about an equilibrium position ($T_v$); the rotation of the molecule as a rigid
body about its center of gravity ($T_r$); interaction between vibration and
rotation ($T_{int}$).

⁶ See Eckart, Phys. Rev. 47, 552 (1935); Sayvetz, J. Chem. Phys. 7, 383 (1939).

9.8. The Kinetic Energy of a Molecule. — It is necessary to obtain (35) 
in explicit form before further calculations can be made. As shown previously,
$T_r$ becomes equal to (23), but it must be remembered that $A, B, \ldots, F$
are instantaneous moments and products of inertia relative to the
moving axes. They are not constants but functions of the position of the
atoms and they change as the molecule vibrates.

In discussing the terms $T_v$ and $T_{int}$, it is convenient to use normal coordinates
(see sec. 10.17). Suppose $\boldsymbol{\rho}_i$ has components $\xi_i/\sqrt{m_i}$, $\eta_i/\sqrt{m_i}$, $\zeta_i/\sqrt{m_i}$
where,

$$\begin{aligned}
\xi_i &= \sum_k l_{ik} Q_k \\
\eta_i &= \sum_k m_{ik} Q_k \\
\zeta_i &= \sum_k n_{ik} Q_k
\end{aligned} \tag{9-36}$$

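and $l_{ik}$, $m_{ik}$, $n_{ik}$ are constant coefficients such that

$$\sum_k l_{ki} l_{kj} = \sum_k m_{ki} m_{kj} = \sum_k n_{ki} n_{kj} = \delta_{ij}$$

Then,

$$2T_v = \sum_i m_i v_i^2 = \sum_i (\dot{\xi}_i^2 + \dot{\eta}_i^2 + \dot{\zeta}_i^2) = \sum_k \dot{Q}_k^2 \tag{9-37}$$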

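Moreover,

$$\begin{aligned}
\sum_i m_i (\boldsymbol{\rho}_i \times \mathbf{v}_i)_x &= \sum_i (\eta_i \dot{\zeta}_i - \zeta_i \dot{\eta}_i) = \sum_k X_k \dot{Q}_k \\
\sum_i m_i (\boldsymbol{\rho}_i \times \mathbf{v}_i)_y &= \sum_i (\zeta_i \dot{\xi}_i - \xi_i \dot{\zeta}_i) = \sum_k Y_k \dot{Q}_k \\
\sum_i m_i (\boldsymbol{\rho}_i \times \mathbf{v}_i)_z &= \sum_i (\xi_i \dot{\eta}_i - \eta_i \dot{\xi}_i) = \sum_k Z_k \dot{Q}_k
\end{aligned} \tag{9-38}$$

where,

$$\begin{aligned}
X_k &= \sum_{i,l} (n_{ik} m_{il} - m_{ik} n_{il}) Q_l \\
Y_k &= \sum_{i,l} (l_{ik} n_{il} - n_{ik} l_{il}) Q_l \\
Z_k &= \sum_{i,l} (m_{ik} l_{il} - l_{ik} m_{il}) Q_l
\end{aligned} \tag{9-39}$$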

Collecting terms, (35) finally appears as

$$\begin{aligned}
2T_t &= v'^2 \sum m_i \\
2T_v &= \sum_k \dot{Q}_k^2 \\
2T_r &= A\omega_x^2 + B\omega_y^2 + C\omega_z^2 - 2D\omega_x\omega_y - 2E\omega_z\omega_y - 2F\omega_x\omega_z \\
2T_{int} &= 2\omega_x \sum_k X_k \dot{Q}_k + 2\omega_y \sum_k Y_k \dot{Q}_k + 2\omega_z \sum_k Z_k \dot{Q}_k
\end{aligned} \tag{9-40}$$

9.9. The Hamiltonian Form of the Kinetic Energy. — In order to obtain
the Hamiltonian form of (40), we must change from angular velocities to
angular momenta. From (17) or (22) we see that the components of
total angular momentum are

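$$\begin{aligned}
P_x &= \frac{\partial T}{\partial \omega_x} = A\omega_x - D\omega_y - F\omega_z + \sum_k X_k \dot{Q}_k \\
P_y &= \frac{\partial T}{\partial \omega_y} = -D\omega_x + B\omega_y - E\omega_z + \sum_k Y_k \dot{Q}_k \\
P_z &= \frac{\partial T}{\partial \omega_z} = -F\omega_x - E\omega_y + C\omega_z + \sum_k Z_k \dot{Q}_k
\end{aligned} \tag{9-41}$$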

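Similarly, the momenta conjugate to $Q_k$ are

$$p_k = \frac{\partial T}{\partial \dot{Q}_k} = \dot{Q}_k + X_k\omega_x + Y_k\omega_y + Z_k\omega_z \tag{9-42}$$

Solving this equation for $\dot{Q}_k$ and substituting in (41) gives

$$P_x = A\omega_x - D\omega_y - F\omega_z + \sum_k X_k (p_k - X_k\omega_x - Y_k\omega_y - Z_k\omega_z) \tag{9-43}$$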

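with similar expressions for $P_y$ and $P_z$. The following abbreviations may
be used to simplify the final results.

$$\begin{aligned}
A' &= A - \sum_k X_k^2; & D' &= D + \sum_k X_k Y_k \\
B' &= B - \sum_k Y_k^2; & E' &= E + \sum_k Y_k Z_k \\
C' &= C - \sum_k Z_k^2; & F' &= F + \sum_k Z_k X_k
\end{aligned} \tag{9-44}$$

In terms of them, we may write

$$\begin{aligned}
P_x &= A'\omega_x - D'\omega_y - F'\omega_z + \sum_k X_k p_k \\
P_y &= -D'\omega_x + B'\omega_y - E'\omega_z + \sum_k Y_k p_k \\
P_z &= -F'\omega_x - E'\omega_y + C'\omega_z + \sum_k Z_k p_k
\end{aligned} \tag{9-43a}$$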

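If we also write

$$p_x = \sum_k X_k p_k; \quad p_y = \sum_k Y_k p_k; \quad p_z = \sum_k Z_k p_k \tag{9-45}$$

(43a) may be further simplified to read

$$\begin{aligned}
P_x &= p_x + A'\omega_x - D'\omega_y - F'\omega_z \\
P_y &= p_y - D'\omega_x + B'\omega_y - E'\omega_z \\
P_z &= p_z - F'\omega_x - E'\omega_y + C'\omega_z
\end{aligned} \tag{9-46}$$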

The quantities $p_x$, $p_y$, $p_z$ arise from vibration alone as may be seen from
their definition, eq. (45); they are called components of internal angular
momentum.

Adding together all the terms of (40) and using (41), (42) and (45),
we obtain

$$2T = 2T_t + (P_x - p_x)\omega_x + (P_y - p_y)\omega_y + (P_z - p_z)\omega_z + \sum_k p_k^2 \tag{9-47}$$

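Finally, we find by solving (46) for the $\omega$'s that

$$\omega_i = \sum_j \mu_{ij} (P_j - p_j); \qquad \mu_{ij} = \mu_{ji} \qquad (i, j = x, y, z) \tag{9-48}$$

With the use of these variables, eq. (47) takes the more elegant form

$$2T = 2T_t + \sum_{i,j} \mu_{ij} (P_i - p_i)(P_j - p_j) + \sum_k p_k^2 \tag{9-49}$$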

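Explicitly the $\mu_{ij}$ are:

$$\begin{aligned}
\mu_{xx} &= \frac{B'C' - E'^2}{\Delta}; & \mu_{yy} &= \frac{A'C' - F'^2}{\Delta}; & \mu_{zz} &= \frac{A'B' - D'^2}{\Delta} \\
\mu_{xy} &= \frac{C'D' + E'F'}{\Delta}; & \mu_{yz} &= \frac{A'E' + D'F'}{\Delta}; & \mu_{xz} &= \frac{D'E' + B'F'}{\Delta}
\end{aligned} \tag{9-50}$$

$$\Delta = \begin{vmatrix} A' & -D' & -F' \\ -D' & B' & -E' \\ -F' & -E' & C' \end{vmatrix}$$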
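The explicit $\mu_{ij}$ are the elements of the reciprocal of the coefficient matrix in (46), so the cofactor formulas can be verified by multiplying the two matrices together. The Python sketch below does this with exact rational arithmetic; the function names and the sample values of $A', \ldots, F'$ are assumptions chosen only for illustration.

```python
from fractions import Fraction


def mu_matrix(A, B, C, D, E, F):
    """Return (mu, delta) for the matrix [[A,-D,-F],[-D,B,-E],[-F,-E,C]].

    The arguments stand for the primed quantities A', ..., F' of eq. (9-44).
    """
    A, B, C, D, E, F = (Fraction(x) for x in (A, B, C, D, E, F))
    delta = A * (B * C - E * E) - D * (C * D + E * F) - F * (D * E + B * F)
    if delta == 0:
        raise ValueError("determinant vanishes; the mu's are undefined")
    xx, yy, zz = B * C - E * E, A * C - F * F, A * B - D * D
    xy, yz, xz = C * D + E * F, A * E + D * F, D * E + B * F
    cofactors = [[xx, xy, xz], [xy, yy, yz], [xz, yz, zz]]
    return [[c / delta for c in row] for row in cofactors], delta


def times_inertia(mu, A, B, C, D, E, F):
    """Multiply mu by the coefficient matrix of (9-46); should give unity."""
    inertia = [[A, -D, -F], [-D, B, -E], [-F, -E, C]]
    return [[sum(mu[i][k] * inertia[k][j] for k in range(3))
             for j in range(3)] for i in range(3)]


mu, delta = mu_matrix(4, 5, 6, 1, 2, 1)        # sample values only
product = times_inertia(mu, 4, 5, 6, 1, 2, 1)
print(delta)
print(product == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
```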



9.10. The Vibrational Energy of a Molecule.⁷ — The first term in (47),
the translational energy, is of little interest in physical problems. We
shall have no more to say about it. The only other term of that equation
which can be treated further by classical mechanics is the last one, corresponding
to the vibrational energy of the molecule. We first consider the
potential energy of the system due to the vibration of the particles. It will
be some function of the mutual positions of the nuclei and it is most convenient
to specify these in terms of the mass-adjusted components of a
displacement vector. We formerly took these as $\xi_i$, $\eta_i$, $\zeta_i$
(see eq. 36), 3n in number. Following convention we now use $q_1, q_2, \ldots, q_{3n}$
for the same coordinates. If the system is placed originally in the
equilibrium configuration (all $q_i = 0$) and if the particles have very small
initial velocities, we assume that they will never depart to any large distance
from that configuration, nor will they ever acquire large velocities.
Under these conditions, we may develop the potential energy V by Taylor's
theorem in terms of ascending powers of the $q_i$.

$$V(q_1, \ldots, q_{3n}) = V_0 + \sum_i \left(\frac{\partial V}{\partial q_i}\right)_0 q_i + \frac{1}{2} \sum_{i,j} \left(\frac{\partial^2 V}{\partial q_i \partial q_j}\right)_0 q_i q_j + \cdots \tag{9-51}$$

⁷ This section as well as secs. 9.11 and 9.12 makes use of some of the results of Chapter 10.
It should be omitted or postponed by readers not familiar with orthogonal
transformations. The authors suggest that the reader, rather than endeavor to understand
normal coordinates by "elementary considerations," acquaint himself with the
more powerful methods of the next chapter.

The constant term $V_0$, which is independent of the $q_i$, can be omitted since
it has no effect on the equations of motion of the system. The term linear
in the $q_i$ must also vanish since $\partial V/\partial q_i = 0$ is the condition for equilibrium.
Finally if we omit all terms beyond the third, we obtain as an approximation
to the vibrational potential energy

$$2V = \sum_{i,j} b_{ij} q_i q_j \tag{9-52}$$

where $b_{ij} = (\partial^2 V/\partial q_i \partial q_j)_0$. From (37) we have, in terms of the coordinates
$q_i$,

$$2T = \sum_i \dot{q}_i^2 \tag{9-53}$$

where T is now written for the former $T_v$.

If we now subject both T and V to an orthogonal transformation (see
sec. 10.17), we obtain

$$2T = \sum_k \dot{Q}_k^2; \qquad 2V = \sum_k \lambda_k Q_k^2 \tag{9-54}$$

where the normal coordinates $Q_k$ are related to the q's by

$$q_i = \sum_k \alpha_{ik} Q_k \tag{9-55}$$

The constants $\lambda_k$ are the 3n eigenvalues found from the characteristic
equation

$$|\lambda \delta_{ij} - b_{ij}| = 0 \tag{9-56}$$

and $\alpha_{ik}$ is the matrix formed from the eigenvectors.

Knowing T and V we may obtain the motion of the molecule by solving
Lagrange's equations (8). They appear as

$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{Q}_k}\right) + \frac{\partial V}{\partial Q_k} = 0$$

or

$$\ddot{Q}_k = -\lambda_k Q_k \tag{9-57}$$

Three different possibilities arise: (a) $\lambda_k > 0$; (b) $\lambda_k = 0$; (c) $\lambda_k < 0$.

a. $\lambda_k > 0$. The solutions of (57) are

$$Q_k = A_k \cos(\sqrt{\lambda_k}\, t + \delta_k); \qquad (k = 1, 2, \ldots, 3n) \tag{9-58}$$

This is the equation of simple harmonic motion with two constants of integration;
$A_k$ is the amplitude and $\delta_k$ the phase constant. Eq. (55) now
reads

$$q_i = \sum_k \alpha_{ik} A_k \cos(\sqrt{\lambda_k}\, t + \delta_k) \tag{9-59}$$

If all of the $A_k$ are zero except one, say $A_1$, then all of the nuclei are acting
as simple harmonic oscillators with a frequency of $\sqrt{\lambda_1}/2\pi$
about their equilibrium position. Each nucleus has the same phase
constant and reaches its equilibrium position at the same time. The amplitudes
will vary because of the factor $\alpha_{ik}$. Such a motion is called a normal
mode of vibration. Actually the situation is much more complex, for many
of the $A_k$ will be different from zero. Thus the motion of the nuclei consists
of a superposition of all the normal modes of vibration, each with its
own frequency $\sqrt{\lambda_k}/2\pi$ and amplitude.

It frequently happens that some of the $\lambda_k$ will be equal to each other
in pairs or threes. This phenomenon, called double or triple degeneracy,⁸
means that two or three equivalent motions of the molecule have the same
frequency and differ only with respect to their orientation in space. The
phase factors and amplitudes must be evaluated from the initial positions
and velocities of the n nuclei. We show in the next section how the normal
modes and coordinates may be determined for a specific example.

b. $\lambda_k = 0$. The solution of (57) is

$$Q_k = A_k t + \delta_k$$

hence the resulting motion is not a vibration. The nuclei will not oscillate
about the equilibrium position but will continually move away. Since the
whole treatment of the problem is based upon small oscillations from the
equilibrium position, we are no longer justified in this case in omitting
higher terms in the potential energy, and the method fails. Actually, it
will be found that six of the $\lambda_k$ vanish in the molecular problem (five if the
equilibrium arrangement of the nuclei is linear). Three of these zero
frequencies may be associated with translation of the molecule along three
mutually perpendicular axes and the remaining three with rotation about
the same axes. When it is desired, the zero frequencies may be removed
from the problem before solving (56). This is done by reducing the
number of coordinates from 3n to 3n − 6, the equations of conservation
of linear and angular momentum (eqs. 33 and 34) being used for that
purpose.

⁸ For a discussion of this effect, see Dennison, Rev. Mod. Phys. 3, 280 (1931), or
Wu, Ta-You, "Vibrational Spectra and Structure of Polyatomic Molecules," National
University of Peking, Kun-Ming, China, 1939.

c. $\lambda_k < 0$. The solution becomes imaginary and again does not correspond
to a vibration. This case never occurs if the potential energy is a
positive definite quadratic form (see sec. 10.12) which is always true in the
molecular problem.
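The behaviour of a single normal coordinate in cases (a) and (b) can be sketched in a few lines of Python. The function names are illustrative assumptions; the central-difference helper is included only so that eq. (57) can be checked numerically, and case (c) is simply rejected because it does not correspond to a vibration.

```python
import math


def normal_coordinate(lam, amplitude, phase, t):
    """Q_k(t) solving eq. (9-57) for lam > 0 (case a) or lam = 0 (case b)."""
    if lam > 0:
        # Simple harmonic motion, eq. (9-58).
        return amplitude * math.cos(math.sqrt(lam) * t + phase)
    if lam == 0:
        # Uniform drift away from equilibrium: not a vibration.
        return amplitude * t + phase
    raise ValueError("lam < 0 does not correspond to a vibration")


def frequency(lam):
    """Frequency sqrt(lam) / (2 pi) of a normal mode with lam > 0."""
    return math.sqrt(lam) / (2.0 * math.pi)


def second_derivative(lam, amplitude, phase, t, h=1e-4):
    """Central-difference estimate of the second time derivative of Q_k."""
    def q(s):
        return normal_coordinate(lam, amplitude, phase, s)
    return (q(t + h) - 2.0 * q(t) + q(t - h)) / (h * h)


# The estimate should agree with -lam * Q_k to within the
# accuracy of the finite-difference approximation.
print(second_derivative(4.0, 1.5, 0.3, 0.7),
      -4.0 * normal_coordinate(4.0, 1.5, 0.3, 0.7))
```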

9.11. Vibrations of a Linear Triatomic Molecule. — As an example of 
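the preceding theory, we consider a linear symmetrical triatomic molecule
XY₂ such as carbon dioxide. Let the central atom X have a mass $m_2$
and the two end particles have mass $m_1$. Let the equilibrium positions
be $x_1^0$ and $x_3^0$ for Y and $x_2^0$ for X. In order to simplify the problem, we
arbitrarily assume that the only motion which the nuclei can make is along the
line joining them, hence the displaced positions are $x_i = x_i^0 + \delta x_i$. If
we now take the potential energy⁹ as proportional to the square of the relative
displacements of the particles, in accordance with eq. (51) we have

$$2V = k\{(\delta x_1 - \delta x_2)^2 + (\delta x_2 - \delta x_3)^2\} \tag{9-60}$$

and

$$2T = m_1(\delta\dot{x}_1^2 + \delta\dot{x}_3^2) + m_2\,\delta\dot{x}_2^2$$

In terms of mass-adjusted coordinates $q_i = \sqrt{m_i}\,\delta x_i$ (with $m_3 = m_1$),

$$2T = \dot{q}_1^2 + \dot{q}_2^2 + \dot{q}_3^2; \qquad 2V = k\left\{\frac{q_1^2}{m_1} + \frac{2q_2^2}{m_2} + \frac{q_3^2}{m_1} - \frac{2(q_1 q_2 + q_2 q_3)}{\sqrt{m_1 m_2}}\right\} \tag{9-61}$$

Comparison with (52) shows us that

$$\begin{aligned}
b_{11} &= k/m_1; & b_{12} = b_{21} &= -k/\sqrt{m_1 m_2}; & b_{13} = b_{31} &= 0 \\
b_{22} &= 2k/m_2; & b_{23} = b_{32} &= -k/\sqrt{m_1 m_2}; & b_{33} &= k/m_1
\end{aligned}$$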

When these values are substituted in (56) and the determinantal equation
is solved we obtain

$$\lambda_1 = k/m_1; \quad \lambda_2 = k\mu; \quad \lambda_3 = 0; \qquad \mu = (2m_1 + m_2)/m_1 m_2 \tag{9-62}$$

In order to find the coefficients $\alpha_{ik}$ of eq. (55) which relate the $q_i$ to the
normal coordinates $Q_k$ it is necessary to find the transformation which
reduces T and V simultaneously to a sum of squares (see sec. 10.17).
According to sec. 10.15 the matrix effecting this transformation has as its
columns the eigenvectors of the matrix $[b_{ij}]$, and these eigenvectors are the
solutions $(x_1, x_2, x_3)$ of the equations

$$\sum_j b_{ij} x_j = \lambda x_i$$

corresponding to the three eigenvalues $\lambda_k$ already found. Simple computation
yields for these eigenvectors

$$\begin{array}{ll}
\lambda = k/m_1: & [-x_3,\ 0,\ x_3] \\
\lambda = k\mu: & \left[x_3,\ -2\sqrt{m_1/m_2}\;x_3,\ x_3\right] \\
\lambda = 0: & \left[x_3,\ \sqrt{m_2/m_1}\;x_3,\ x_3\right]
\end{array}$$

They are already orthogonal; when $x_3$ is also fixed by normalization, i.e.,
by equating the sum of the squares of the components of each vector to
unity, they may be compounded to give

⁹ See Wu, loc. cit., for remarks concerning the choice of the potential energy in
special cases.


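$$\alpha_{ik} = \begin{bmatrix}
-1/\sqrt{2} & 1/\sqrt{2\mu m_1} & 1/\sqrt{\mu m_2} \\
0 & -2/\sqrt{2\mu m_2} & 1/\sqrt{\mu m_1} \\
1/\sqrt{2} & 1/\sqrt{2\mu m_1} & 1/\sqrt{\mu m_2}
\end{bmatrix} \tag{9-63}$$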


We can now find the normal modes of vibration from (59). Taking
$\lambda_k = \lambda_1$, we see that the two end atoms move in opposite directions while
the central atom is stationary. The other normal modes are found in the
same way. They are shown in Fig. 5. It will be observed that for the
zero frequency¹⁰ $\lambda_3$, the motion is translational, since
$\delta x_1 = m_1^{-1/2} q_1 = m_1^{-1/2}\alpha_{13}(A_3 t + \delta_3) = (2m_1 + m_2)^{-1/2}(A_3 t + \delta_3)$,
and $\delta x_2$, $\delta x_3$ also equal this expression.

[Fig. 9-5. Normal modes of the linear triatomic molecule, labeled $\lambda_1$, $\lambda_2$, $\lambda_3$.]

¹⁰ This frequency could have been removed from the problem by applying the condition
$m_1(\delta x_1 + \delta x_3) + m_2\,\delta x_2 = 0$ to (60).

The treatment of this molecule is not complete because of the artificial
assumption that the motion is only along the line of nuclei. For a complete
treatment the reader is referred to Dennison (loc. cit.) or Wu (loc.
cit.).
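The eigenvalues (9-62) and the matrix (9-63) can be checked against the matrix $[b_{ij}]$ by direct substitution. The following Python sketch does this numerically; the function names are illustrative, and the force constant and masses passed in at the end are arbitrary sample values rather than data for any real molecule.

```python
import math


def force_matrix(k, m1, m2):
    """Matrix [b_ij] of the linear XY2 molecule in mass-adjusted coordinates."""
    c = k / math.sqrt(m1 * m2)
    return [[k / m1, -c, 0.0],
            [-c, 2.0 * k / m2, -c],
            [0.0, -c, k / m1]]


def normal_modes(k, m1, m2):
    """Eigenvalues of eq. (9-62) and the matrix alpha of eq. (9-63)."""
    mu = (2.0 * m1 + m2) / (m1 * m2)
    lams = [k / m1, k * mu, 0.0]
    a = 1.0 / math.sqrt(2.0)
    b = 1.0 / math.sqrt(2.0 * mu * m1)
    c = 1.0 / math.sqrt(mu * m2)
    alpha = [[-a, b, c],
             [0.0, -2.0 / math.sqrt(2.0 * mu * m2), 1.0 / math.sqrt(mu * m1)],
             [a, b, c]]
    return lams, alpha


def largest_residual(k, m1, m2):
    """Largest |sum_j b_ij alpha_jn - lam_n alpha_in| over all i and n."""
    b = force_matrix(k, m1, m2)
    lams, alpha = normal_modes(k, m1, m2)
    return max(abs(sum(b[i][j] * alpha[j][n] for j in range(3))
                   - lams[n] * alpha[i][n])
               for i in range(3) for n in range(3))


def orthonormality_error(k, m1, m2):
    """Largest deviation of the columns of alpha from orthonormality."""
    _, alpha = normal_modes(k, m1, m2)
    return max(abs(sum(alpha[i][p] * alpha[i][q] for i in range(3))
                   - (1.0 if p == q else 0.0))
               for p in range(3) for q in range(3))


# Sample values: k = 1, end masses 16, central mass 12 (arbitrary units).
print(largest_residual(1.0, 16.0, 12.0), orthonormality_error(1.0, 16.0, 12.0))
```

Both printed numbers should be zero to within rounding error if the columns of (9-63) are orthonormal eigenvectors of $[b_{ij}]$.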

9.12. Quantum Mechanical Hamiltonian. — Lack of space forbids the 
transcription of the results thus far obtained into the quantum mechanical 
language of Chapter 11. To provide a general view, however, we shall 
append here a few comments indicating the line of attack to be taken on the 
problem of the polyatomic molecule from the quantum point of view. The 
material of this section is not needed in other parts of this book. The 
expression for the classical kinetic energy found in (49) contains momenta
$p_a$ ($a = x, y, z$) defined in eq. (45) and $p_k$ defined in eq. (42). Both of
these are conjugate to the normal coordinates $Q_k$. On the other hand the
momenta $P_a$ of (43) are not conjugate to $Q_k$. In order to obtain a suitable
expression for use in quantum mechanical calculations, all of the coordinates
and momenta must be conjugate to each other. It is true that the
$P_a$, which are functions of the angular velocities, could be written in terms
of some set of coordinates such as the Eulerian angles, and then the Eulerian
angles $\alpha$, $\beta$, $\gamma$, the normal coordinates and the conjugate momenta $p_\alpha$, $p_\beta$, $p_\gamma$
and $p_k$ would be appropriate. The coordinates used in (49) may be
retained, however, as shown by several authors. The correct quantum
mechanical Hamiltonian¹¹ is

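$$\begin{aligned}
H = {}& \tfrac{1}{2}\mu^{1/4} \sum_{a,b} (P_a - p_a)\,\mu_{ab}\,\mu^{-1/2}\,(P_b - p_b)\,\mu^{1/4} \\
& + \tfrac{1}{2}\mu^{1/4} \sum_k p_k\,\mu^{-1/2}\,p_k\,\mu^{1/4} + V
\end{aligned} \tag{9-64}$$

where a, b denote x, y, z and $\mu$ is the determinant of $\mu_{ab}$ (cf. eq. 50). This
expression may be simplified by noting that $P_a$ commutes with $p_a$ and that
the $\mu_{ab}$ are functions only of the $Q_k$. We thus obtain

$$\begin{aligned}
H = {}& \tfrac{1}{2} \sum_{a,b} \mu_{ab} P_a P_b - \sum_a h_a P_a + \tfrac{1}{2}\mu^{1/4} \sum_{a,b} p_a\,\mu_{ab}\,\mu^{-1/2}\,p_b\,\mu^{1/4} \\
& + \tfrac{1}{2}\mu^{1/4} \sum_k p_k\,\mu^{-1/2}\,p_k\,\mu^{1/4} + V
\end{aligned} \tag{9-65}$$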

where

$$h_a = \tfrac{1}{2} \sum_b \left[ 2\mu_{ab}\,p_b + p_b\,\mu_{ab} + \ldots \right] \tag{9-66}$$

and $p_b$ does not commute with the $\mu$'s.

For the sake of greater generality, we no longer need confine ourselves
to the potential energy expression previously used but write the most
general function consistent with the symmetry of the molecule

$$V = V_0 + V_1 + V_2 + \cdots$$

The first term, $V_0$, is identical with that given in (52) or (54); $V_1$ is homogeneous
in the third powers of the normal coordinates and their cross-products;
$V_2$ is of the fourth power, etc. When the Hamiltonian is
expanded, it is found that it can be divided into terms of different orders
as follows:

$$H = H_0 + H_1 + H_2 + \cdots \tag{9-67}$$

¹¹ See Wilson, E. B. and Howard, J. B., J. Chem. Phys. 4, 260 (1936) or Dennison,
D. M., and Darling, B. T., Phys. Rev. 57, 128 (1940).

The explicit form of $H_0$ is

$$2H_0 = \left[\frac{P_x^2}{A_0} + \frac{P_y^2}{B_0} + \frac{P_z^2}{C_0}\right] + \sum_k p_k^2 + 2V_0 \tag{9-68}$$

where $A_0$, $B_0$, $C_0$ are the equilibrium moments of inertia. It is seen that
this represents the sum of the Hamiltonians of a rigid rotator and a harmonic
oscillator; hence this part of H may be treated exactly by the
methods of quantum mechanics as outlined in Chapter 11. The higher
terms in the Hamiltonian, however, must be handled by perturbation
methods.¹²

¹² The method of doing this is discussed in detail by Nielsen, H. H., Phys. Rev. 60,
794 (1941).

MATRICES AND MATRIX ALGEBRA

In ordinary arithmetic, attention is focused upon single numbers.
These numbers may be combined by various operations, such as addition,
subtraction, multiplication and so on, to yield new numbers. In many
branches of algebra, the student is forced to confer interest, not upon
single numbers, but on collections of numbers (or functions). These collections
can be simple sequences like $a_1, a_2, \ldots, a_n$, in which the order of
the individuals may, or may not, be of importance. A vector is an example
of this kind. When such a sequence is written down, no understanding
prevails that the numbers are to be combined in a certain way; it is the
collection itself which matters. Meaning is imparted to the collection by
specifying how it is to be combined with other collections.

Besides simple sequences, collections of two-dimensional character are
often objects of interest in mathematics, and recently in physics and
chemistry. They may have a great variety of forms; they may be triangular,
as

$$\begin{matrix} a_1 & & \\ b_1 & b_2 & \\ c_1 & c_2 & c_3 \end{matrix}$$

or rectangular, as

$$\begin{matrix} a_1 & a_2 & a_3 & \cdots & a_n \\ b_1 & b_2 & b_3 & \cdots & b_n \\ c_1 & c_2 & c_3 & \cdots & c_n \end{matrix}$$

or quadratic, as

$$\begin{matrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ d_1 & d_2 & d_3 & d_4 \end{matrix}$$

Of these, the rectangular and quadratic ones are of greatest value.
Without further specification, they are simply arrays, devoid of meaning.
But when rules are laid down, stating how they may be combined to form
new arrays, they become objects of mathematical importance, such as
determinants and matrices. This is usually indicated by enclosing the array
in bars or brackets of different form, bars being frequently used for determinants,
brackets for matrices. It is also convenient to use a single letter
for the individuals of a collection, and to distinguish the individuals of a
simple or linear collection by single subscripts, those of a two-dimensional
collection by two subscripts.

10.1. Arrays. — A collection of real or complex quantities is called an 
array if it can be displayed in an orderly table of rows and columns. The 
individual members of the array are its elements. Each is equipped with a 
pair of indices, the first one referring to the row and the second one to the
column in which the element is located. For example, the element $A_{pq}$
will appear in the p-th row and the q-th column. If the number of rows n
equals the number of columns, the array is said to be square (or quadratic)
and of order n; if there are n rows and m columns ($n \neq m$), the array is
rectangular and of order (n × m).

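10.2. Determinants. — The most familiar type of array is the determinant,¹
which always has an equal number of rows and columns. It will be
written in one of the forms:

$$\det A = |A| = \begin{vmatrix}
A_{11} & A_{12} & A_{13} & \cdots & A_{1n} \\
A_{21} & A_{22} & A_{23} & \cdots & A_{2n} \\
A_{31} & A_{32} & A_{33} & \cdots & A_{3n} \\
\vdots & \vdots & \vdots & & \vdots \\
A_{n1} & A_{n2} & A_{n3} & \cdots & A_{nn}
\end{vmatrix}$$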

The value of the determinant is obtained by the following procedure.
First, a total of n! products is formed by taking one element from each row
and column. Each product is then arranged so that the first subscripts
of the elements are in their natural order 1, 2, ..., n. When this has
been done, it will be found that the products may be separated into even
and odd classes each containing n!/2 terms, as follows. In the even class,
an even number of interchanges of the elements is required to bring the
second subscripts into their natural order while in the odd class, an odd
number of interchanges is needed. For example, $A_{12}A_{23}A_{31}$ is in the even
class while $A_{12}A_{21}A_{33}$ is in the odd class. If a plus sign is affixed to the
even products and a minus sign to the odd ones, the algebraic sum of the
n! terms, by definition, is the value of the determinant. We may thus
write

$$|A| = \sum (-1)^h A_{1r_1} A_{2r_2} \cdots A_{nr_n} \tag{10-1}$$

where the summation is made over all permutations of $r_1, r_2, \ldots, r_n$, and h
is the number of interchanges required to restore the natural order.

¹ Treatises on determinants include: Kowalewski, G., "Einführung in die Determinantentheorie,"
de Gruyter, Leipzig, 1909; Muir, T., "Theory of Determinants,"
London, 1906-23.
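The definition (10-1) translates almost literally into code. The Python sketch below sums over all permutations of the column indices; as an assumption of this sketch, the parity of each permutation is obtained by counting inversions rather than interchanges (the two counts always have the same parity), and the function names are illustrative. Indices in the code start from 0 rather than 1.

```python
from itertools import permutations


def inversion_count(perm):
    """Number of pairs out of natural order; same parity as the number h."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if perm[i] > perm[j])


def det_by_permutations(a):
    """Value of a determinant from eq. (10-1); a is a square list of rows."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = 1
        for row, col in enumerate(perm):   # one element from each row and column
            term *= a[row][col]
        total += (-1) ** inversion_count(perm) * term
    return total


print(det_by_permutations([[1, 2], [3, 4]]))                    # -2
print(det_by_permutations([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))   # 18
```

Since the sum contains n! terms, this is practical only for small orders, which is consistent with the remark in sec. 10.3 that other procedures are preferred for larger determinants.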

The following properties are direct consequences² of this definition. In
each statement, the word row may be replaced by the word column and
the reverse.

1. The value of a determinant vanishes, |A| = 0, when:

a. All elements of a row are zero.

b. All elements of one row are identical with, or multiples of, the
corresponding elements of another row.

2. The value of a determinant is unchanged, if:

a. Rows and columns are interchanged.

b. A linear combination of any number of rows is added to any
row; i.e., if $A_{ij}$ is replaced by $\sum_{k=1}^{n} c_k A_{kj}$, j = 1, 2, ..., n, provided the $c_k$ are
fixed numbers with $c_i = 1$.

3. The value of a determinant changes sign if two rows are interchanged.

4. If each element in any one row appears as the sum (or difference) of
two or more quantities, the determinant may be written as a sum (or difference)
of two or more determinants of the same order. Thus if the order
is two

$$\begin{vmatrix} A_{11} \pm B_{11} & A_{12} \pm B_{12} \\ A_{21} & A_{22} \end{vmatrix} = \begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix} \pm \begin{vmatrix} B_{11} & B_{12} \\ A_{21} & A_{22} \end{vmatrix}$$

5. If all elements of a row are multiplied by a constant factor, the value
of the determinant is multiplied by the same factor.

10.3. Minors and Cofactors. — The complementary minor of an ele- 
ment Apq is the determinant obtained by striking out the row and column 
in which $A_{pq}$ appears. The cofactor of $A_{pq}$ is $(-1)^{p+q}$ times its complementary
minor. It will be indicated by $A^{pq}$. It follows from eq. (1) that

$$|A| = \sum_{i=1}^{n} A_{ik} A^{ik} = \sum_{i=1}^{n} A_{ki} A^{ki}; \qquad (k = 1, 2, \ldots, n) \tag{10-2}$$

However,

$$\sum_{i=1}^{n} A_{ik} A^{ij} = \sum_{i=1}^{n} A_{ki} A^{ji} = 0; \qquad j \neq k \tag{10-3}$$

for comparison with (2) shows that these equations are the expansion of a
determinant whose k-th and j-th columns are identical with the k-th column
of |A|, and according to property 1-b of sec. 10.2, if two columns are
identical, |A| = 0. Eq. (2), called the Laplace development, is commonly
used for numerical evaluation of determinants, but if their order is larger
than three or four, the number of terms and the labor involved is so great
that other procedures are to be preferred. We describe one in sec. 13.27.

² Details of the proofs may be found in Kowalewski, loc. cit.
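Equations (10-2) and (10-3) can be tried out on a small example. The Python sketch below evaluates a determinant recursively by the Laplace development along the first row and then forms the column sums of both equations; the function names and the sample matrix are assumptions made for illustration, and indices start from 0.

```python
def minor(a, p, q):
    """Array left after striking out row p and column q of a."""
    return [row[:q] + row[q + 1:] for i, row in enumerate(a) if i != p]


def det(a):
    """Laplace development along the first row (a is a list of lists)."""
    if len(a) == 1:
        return a[0][0]
    return sum(a[0][k] * cofactor(a, 0, k) for k in range(len(a)))


def cofactor(a, p, q):
    """Cofactor A^pq: (-1)^(p+q) times the complementary minor."""
    return (-1) ** (p + q) * det(minor(a, p, q))


def column_sum(a, k, j):
    """sum_i A_ik A^ij: |A| when j == k (10-2), zero otherwise (10-3)."""
    return sum(a[i][k] * cofactor(a, i, j) for i in range(len(a)))


a = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
print(det(a))                                                   # 18
print([[column_sum(a, k, j) for j in range(3)] for k in range(3)])
```

For this matrix the second print statement should show 18 on the diagonal and zeros elsewhere.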

10.4. Multiplication and Differentiation of Determinants. — If | A | 
and |B| are determinants of order n, the product |C|

$$|A|\,|B| = |C|$$

is a determinant of the same order. Its elements are given by one of the
four equivalent (though not equal!) expressions

$$C_{ij} = \sum_{k=1}^{n} A_{ik}B_{kj} \ \text{ or } \ \sum_{k=1}^{n} A_{ik}B_{jk} \ \text{ or } \ \sum_{k=1}^{n} A_{ki}B_{kj} \ \text{ or } \ \sum_{k=1}^{n} A_{ki}B_{jk} \tag{10-4}$$

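The proof for determinants of order two follows. Using the first form
of (4) we obtain

$$|C| = \begin{vmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{vmatrix}$$

but according to property 4 of sec. 10.2, the product may also be written

$$\begin{aligned}
|C| = {}& \begin{vmatrix} A_{11}B_{11} & A_{11}B_{12} \\ A_{21}B_{11} & A_{21}B_{12} \end{vmatrix} + \begin{vmatrix} A_{11}B_{11} & A_{11}B_{12} \\ A_{22}B_{21} & A_{22}B_{22} \end{vmatrix} \\
& + \begin{vmatrix} A_{12}B_{21} & A_{12}B_{22} \\ A_{21}B_{11} & A_{21}B_{12} \end{vmatrix} + \begin{vmatrix} A_{12}B_{21} & A_{12}B_{22} \\ A_{22}B_{21} & A_{22}B_{22} \end{vmatrix}
\end{aligned}$$

The first and last terms of this sum vanish, for if the constant factor $A_{11}A_{21}$
is removed from the first determinant its first row is identical with its
second row. Removal of the constant factor $A_{12}A_{22}$ from the last determinant
likewise leaves it with two identical rows. Constant factors may also be
removed from the remaining determinants but they do not vanish. The
result is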


$$|C| = A_{11}A_{22} \begin{vmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{vmatrix} + A_{12}A_{21} \begin{vmatrix} B_{21} & B_{22} \\ B_{11} & B_{12} \end{vmatrix}$$

Referring to property 3 of sec. 10.2 we see that this becomes

$$|C| = (A_{11}A_{22} - A_{12}A_{21}) \begin{vmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{vmatrix}$$

Finally we note that $(A_{11}A_{22} - A_{12}A_{21})$ is just the Laplace development
of |A| so that we have shown the equivalence of the determinant |C|
with the product |A| |B|. The proof with the other forms of (4) is
similar. The method is also clearly applicable to determinants of higher
order.

From (2), the partial derivative of a determinant with respect to an
element equals the cofactor:

$$\frac{\partial |A|}{\partial A_{ik}} = A^{ik}$$
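The statement that the four forms of (10-4) are equivalent though not equal can be seen on a numerical example. In the Python sketch below the four arrays of elements are formed as AB, AB̃, ÃB and ÃB̃, where the tilde denotes interchange of rows and columns; the function names and the sample determinants are assumptions of this illustration.

```python
def det(a):
    """Determinant by development along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** k * a[0][k] * det([row[:k] + row[k + 1:] for row in a[1:]])
               for k in range(len(a)))


def transpose(a):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*a)]


def row_by_column(a, b):
    """C_ij = sum_k A_ik B_kj, the first form of eq. (10-4)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def four_products(a, b):
    """The four arrays of elements allowed by eq. (10-4), in the order given."""
    at, bt = transpose(a), transpose(b)
    return [row_by_column(a, b), row_by_column(a, bt),
            row_by_column(at, b), row_by_column(at, bt)]


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 2]]
products = four_products(A, B)
print(products)                                   # four different arrays
print([det(c) for c in products], det(A) * det(B))
```

The four arrays differ element by element, yet each has the value |A| |B| when evaluated as a determinant.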




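10.5. Preliminary Remarks on Matrices. — If two or more arrays may
be combined in a certain way described in sec. 10.6, they are called matrices.³
We indicate them by

$$A = [A_{ij}] = \begin{bmatrix}
A_{11} & A_{12} & A_{13} & \cdots & A_{1m} \\
A_{21} & A_{22} & A_{23} & \cdots & A_{2m} \\
A_{31} & A_{32} & A_{33} & \cdots & A_{3m} \\
\vdots & \vdots & \vdots & & \vdots \\
A_{n1} & A_{n2} & A_{n3} & \cdots & A_{nm}
\end{bmatrix}$$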

Unlike determinants, matrices may be square or rectangular. Matrices
of infinite order⁴ will not be discussed here. When a matrix contains only
one row or column, it is called a vector. For a row vector we will write

$$[x] = [x_1, x_2, x_3, \ldots, x_n] \tag{10-5a}$$

In order to save space, we write a column vector as

$$\{x\} = \{x_1, x_2, x_3, \ldots, x_n\} \tag{10-5b}$$

although its matrix form would be

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{bmatrix}$$

A small letter u, v, ..., z written without brace or bracket always means a
column vector. Matrices with two or more rows or columns will be indicated
by capital letters.

The elements of a square matrix A may be written and evaluated as a
determinant. If |A| = 0, the corresponding matrix A is called singular.
Since determinants do not exist for rectangular (non-quadratic) arrays, all
rectangular matrices, by definition, are singular. Suppose we formed
determinants of all possible orders by taking successively 1, 2, ..., n rows
and columns of A. If at least one determinant of order r does not vanish
and all determinants of order greater than r do vanish, A is said to be of
rank r. Thus if A is singular and of order n, r < n; if non-singular,
r = n.

³ For treatises on matrix theory, the following may be consulted: Wedderburn,
J. H. M., "Lectures on Matrices," Am. Math. Soc. Colloquium Publications, Vol. 17,
New York, 1934; Frazer, R. A., Duncan, W. J. and Collar, A. R., "Elementary Matrices,"
Cambridge University Press, 1938.

⁴ For their properties, see Wintner, A., "Spektraltheorie der unendlichen Matrizen,"
S. Hirzel, Leipzig, 1929.

Problem. Show that the rank of the following matrix is two:

$$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 2 & 3 & -1 \\ 0 & 0 & 1 & -3 \\ 3 & 3 & 5 & -3 \end{bmatrix}$$
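A rank can be checked numerically by a different route from the definition given above. The Python sketch below, offered only as an illustration, reduces the array by elementary row operations with exact fractions and counts the pivots; this count equals the order of the largest non-vanishing determinant, but the method itself is a substitute for, not a transcription of, the determinant test in the text. The first row of the sample matrix is taken as 1, 1, 1, 1, which is an assumption where the printed copy is hard to read.

```python
from fractions import Fraction


def rank(matrix):
    """Rank found by row reduction with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0                                   # number of pivots found so far
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue                        # no pivot available in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


m = [[1, 1, 1, 1],
     [2, 2, 3, -1],
     [0, 0, 1, -3],
     [3, 3, 5, -3]]
print(rank(m))  # 2
```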

10.6. Combination of Matrices. — Two matrices A and B are equal if 
and only if they are identical. If A = B, then $A_{pq} = B_{pq}$ for every p
and q.

The addition or subtraction of two matrices of order n gives a new
matrix of the same order according to the following rule. If A ± B = C,
then $C_{pq} = A_{pq} \pm B_{pq}$. Addition and subtraction are both commutative
and associative.

$$A \pm B = \pm B + A; \qquad (A \pm B) \pm C = A + (\pm B \pm C)$$

Multiplication of a matrix by a scalar quantity a is defined by

$$aA = a[A_{ij}] = [aA_{ij}] = Aa$$

Two matrices A and B may be multiplied together in the order AB
only when the number of columns in A equals the number of rows in B.
Under this condition, the matrices are said to be conformable. If A is of
order (n × h), B of order (h × m), the product C is of order (n × m).
Its elements⁵ are given by

$$C_{pq} = \sum_{s=1}^{h} A_{ps}B_{sq}; \qquad (p = 1, 2, \ldots, n;\ q = 1, 2, \ldots, m)$$

$$AB = [C_{ij}] = C \tag{10-6}$$

In general, AB ≠ BA, but when the order of multiplication is of no importance,
so that AB = BA, the two matrices are said to commute or to be
permutable. The ordinary laws regarding distribution and association
apply,

$$A(B + C)F = ABF + ACF; \qquad (AB)C = A(BC) = ABC$$

Provided A, B, x and y are properly conformable

$$A\{x\} = \{y\}; \qquad [x]A = [y]$$

$$[x]\{y\} = \text{a scalar}; \qquad \{x\}[y] = B \tag{10-7}$$

⁵ Note that the law of matrix multiplication is identical with the first form of eq. (4)
which defines the multiplication of determinants.

In the last case, B is a square matrix which has the same number of rows
as {x} (or columns as [y]). Its rows (or columns) are proportional to each
other.
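The multiplication rule (10-6) and the vector products of (10-7) can be illustrated with a few lines of Python. The function name `matmul` and the sample matrices are assumptions of this sketch; matrices are stored as lists of rows, so a row vector is a list containing one row and a column vector is a list of one-element rows.

```python
def matmul(a, b):
    """Product C = AB of eq. (10-6); raises if A and B are not conformable."""
    if len(a[0]) != len(b):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(a[p][s] * b[s][q] for s in range(len(b)))
             for q in range(len(b[0]))] for p in range(len(a))]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(matmul(A, B))   # [[2, 1], [4, 3]]
print(matmul(B, A))   # [[3, 4], [1, 2]], so AB and BA differ here

row_x = [[1, 2, 3]]           # [x], a row vector
col_y = [[4], [5], [6]]       # {y}, a column vector
print(matmul(row_x, col_y))   # [[32]], a single number

col_x = [[1], [2], [3]]       # {x}
row_y = [[4, 5, 6]]           # [y]
print(matmul(col_x, row_y))   # [[4, 5, 6], [8, 10, 12], [12, 15, 18]]
```

In the last product every row is a multiple of [4, 5, 6], which is the proportionality of rows mentioned in the text.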

A given matrix may be divided into smaller matrices, the result being a 
partitioned matrix. For example, a square matrix of order three may be 
divided into four submatrices as shown. 


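$$A = \left[\begin{array}{cc|c}
A_{11} & A_{12} & A_{13} \\
A_{21} & A_{22} & A_{23} \\ \hline
A_{31} & A_{32} & A_{33}
\end{array}\right] = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$

where

$$a_{11} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}; \quad a_{12} = \begin{bmatrix} A_{13} \\ A_{23} \end{bmatrix}; \quad a_{21} = \begin{bmatrix} A_{31} & A_{32} \end{bmatrix}; \quad a_{22} = A_{33}$$

If B is a similar matrix and is similarly partitioned, then each submatrix
$a_{ij}$ and $b_{ij}$ may be treated as a single element so that

$$AB = C = \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{bmatrix}$$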

Finally, the elements of C are completely evaluated by the usual rules for 
matrix multiplication and addition. This is a valuable property and is 
used frequently. 

If $A = [A_{kj}]$ is a square matrix of order m and $B = [B_{pq}]$ is a square
matrix of order n, then the direct product

$$A \times B = [A_{kj}B_{pq}]$$

is a square matrix of order mn. The index pairs (k,p) and (j,q) refer to
the row and column, respectively. A suitable convention for arranging
the rows and columns consists in taking these pairs in such a way that
(j,q) precedes (j',q') if j < j', or if j = j' and q < q' (dictionary order).
If A, C are of order m and B, F of order n then

$$(A \times B)(C \times F) = AC \times BF$$

is a matrix of order mn. The direct product of matrices has of course
nothing to do with the cross product of vectors, for which the same symbol,
×, is used.

Problem a. Prove eq. (7). 

Problem b. Prove that (A × B)(C × F) = AC × BF.
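Before attempting the proof, the relation of Problem b can be tried on a numerical example. The Python sketch below builds the direct product with rows and columns in dictionary order; the function names and the sample matrices are assumptions of this illustration, and a numerical check of this kind is not a substitute for the proof asked for.

```python
def matmul(a, b):
    """Ordinary matrix product of two conformable matrices."""
    return [[sum(a[p][s] * b[s][q] for s in range(len(b)))
             for q in range(len(b[0]))] for p in range(len(a))]


def direct_product(a, b):
    """A x B with row pairs (k, p) and column pairs (j, q) in dictionary order."""
    return [[a[k][j] * b[p][q]
             for j in range(len(a[0])) for q in range(len(b[0]))]
            for k in range(len(a)) for p in range(len(b))]


def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


A = [[1, 2], [3, 4]]
C = [[0, 1], [1, 1]]
B = [[1, 0, 2], [0, 1, 1], [1, 1, 0]]
F = [[2, 1, 0], [0, 1, 1], [1, 0, 1]]

left = matmul(direct_product(A, B), direct_product(C, F))
right = direct_product(matmul(A, C), matmul(B, F))
print(left == right)                                     # True
print(len(direct_product(A, B)))                         # 6, the order mn
print(trace(direct_product(A, B)), trace(A) * trace(B))  # 10 10
```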

10.7. Special Matrices. — When all the elements of a matrix are zero, 
the matrix is called null and indicated by O. For any matrix A,

$$O + A = A; \qquad OA = AO = O$$





It should not be inferred, however, that the vanishing of a matrix product 
implies that either or both of the matrices multiplied together are the null 
matrix (cf. Problem, sec. 10.7). 

The unit matrix E has unity for elements along the main diagonal.⁶
All other elements are zero. The matrix elements are conveniently
symbolized by the Kronecker delta (cf. sec. 3.4)

$$\delta_{pq} = \begin{cases} 0; & p \neq q \\ 1; & p = q \end{cases}$$

For every matrix,

$$EA = AE = A$$


If all matrix elements vanish except diagonal ones, the matrix is called
diagonal. The general element of a diagonal matrix is thus of the form
$D_i\delta_{ij}$. All diagonal matrices commute with each other, for if D and D'
are diagonal

$$(DD')_{ik} = \sum_{j=1}^{n} D_i\delta_{ij}D'_j\delta_{jk} = D_iD'_i\delta_{ik} = (D'D)_{ik}$$

Conversely, if $D$ is known to be diagonal and not a multiple of $E$ (more
precisely, if no two of its diagonal elements are equal) and if it
commutes with a matrix $A$, so that

$$DA = AD$$

then $A$ must also be diagonal. We will sometimes write a diagonal matrix
as

$$D = \mathrm{diag}\,(D_1, D_2, \cdots, D_n)$$

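The sum of the diagonal elements of a square matrix is called the trace
(German "Spur"),

$$\mathrm{Tr}\,A = \sum_{i} A_{ii}$$

The trace of the product of two matrices is independent of the order of
multiplication; for more than two factors it is unchanged by a cyclic
permutation of the factors. The proof is simple.

$$\mathrm{Tr}\,AB = \sum_{i}(AB)_{ii} = \sum_{i}\sum_{j} A_{ij}B_{ji} = \sum_{j}(BA)_{jj} = \mathrm{Tr}\,BA$$

If $A \times B = C$, $\mathrm{Tr}\,C = \mathrm{Tr}\,A \cdot \mathrm{Tr}\,B$.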
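The trace rules can be illustrated with a few small integer matrices. This is a sketch only; the matrices and the helper names `matmul`, `trace` and `direct_product` are assumptions made for the example.

```python
def matmul(A, B):
    """Ordinary matrix product of lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def trace(A):
    """Sum of the diagonal elements of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))


def direct_product(A, B):
    """Direct product [A_kj * B_pq] in dictionary order."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


# Hypothetical 2 x 2 test matrices.
A = [[1, 2], [3, 4]]
B = [[0, 5], [-1, 2]]
C = [[1, 1], [0, 1]]

tr_ab = trace(matmul(A, B))
tr_ba = trace(matmul(B, A))
tr_abc = trace(matmul(matmul(A, B), C))   # Tr ABC
tr_bca = trace(matmul(matmul(B, C), A))   # cyclic permutation of ABC
tr_acb = trace(matmul(matmul(A, C), B))   # non-cyclic rearrangement
tr_direct = trace(direct_product(A, B))

print(tr_ab == tr_ba)                      # two factors: order is immaterial
print(tr_abc == tr_bca, tr_abc == tr_acb)  # only cyclic permutations are safe
print(tr_direct == trace(A) * trace(B))    # trace of a direct product
```

The third trace shows that for three factors a non-cyclic rearrangement can change the result, which is why only cyclic permutations are safe in general.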

The transposed matrix to $A$, indicated by $\tilde{A} = [A_{ji}]$, is formed from $A$
by interchanging rows and columns. If $A$ and $B$ of (6) are transposed, $\tilde{A}$
becomes of order $(h \times n)$ and $\tilde{B}$ of order $(m \times h)$. They may be multiplied
together only in the order $\tilde{B}\tilde{A}$, and the product $\tilde{C}$ is of order $(m \times n)$.
Thus when a matrix product is transposed, the sequence of the matrices
forming the product must be reversed. This holds true for any number of
factors

$$F = ABCD \cdots X; \quad \tilde{F} = \tilde{X} \cdots \tilde{D}\tilde{C}\tilde{B}\tilde{A}$$

⁶ The main or principal diagonal is that running from the upper left to the lower
right of the array.

The matrix $\hat{A} = [A^{qp}]$ is the adjoint matrix.⁷ Note that the adjoint is
formed by first finding the cofactor $A^{pq}$ of the element $A_{pq}$ in $|A|$ and
then transposing the resulting matrix. From the properties of determinants,
it follows that

$$A\hat{A} = \hat{A}A = |A|E \tag{10-8}$$

hence if $A$ is singular

$$A\hat{A} = \hat{A}A = O \tag{10-9}$$

However, the adjoint matrix exists even when $A$ is singular.

When $A$ is a non-singular square matrix, we may divide $\hat{A}$ by $|A|$ to
obtain a matrix $A^{-1}$ which is the reciprocal of $A$. Only square matrices
have reciprocals.

$$A^{-1} = \frac{\hat{A}}{|A|}; \quad AA^{-1} = A^{-1}A = E \tag{10-10}$$
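Relations (10-8) to (10-10) can be tried out with exact rational arithmetic. The sketch below is illustrative only: the function names, the cofactor expansion used for the determinant, and both test matrices are assumptions, and the recursive determinant is practical only for small orders.

```python
from fractions import Fraction


def matmul(A, B):
    """Ordinary matrix product of lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def minor(A, i, j):
    """Matrix left after striking out row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by cofactor expansion along the first row."""
    if not A:
        return 1  # determinant of the empty matrix
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))


def adjoint(A):
    """Adjoint: element (i, j) is the cofactor of A[j][i] (cofactors, then transpose)."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)]
            for i in range(n)]


def reciprocal(A):
    """Reciprocal matrix, adjoint divided by the determinant."""
    d = det(A)
    if d == 0:
        raise ValueError("a singular matrix has no reciprocal")
    return [[Fraction(x, d) for x in row] for row in adjoint(A)]


A = [[2, 1, 1], [1, 3, 2], [1, 0, 0]]   # hypothetical non-singular matrix, |A| = -1
S = [[2, 0, 1], [1, 3, 2], [1, 1, 1]]   # hypothetical singular matrix, |S| = 0

print(matmul(A, adjoint(A)))      # |A| times the unit matrix, eq. (10-8)
print(matmul(S, adjoint(S)))      # the null matrix, eq. (10-9)
print(matmul(A, reciprocal(A)))   # the unit matrix, eq. (10-10)
```

The adjoint of the singular matrix is still well defined, in line with the remark above; only the division by the determinant fails.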

Suppose the matrices of (6) are square and non-singular. Multiply both
sides of the equation by $B^{-1}A^{-1}$ and then by $C^{-1}$ in the order shown:

$$B^{-1}A^{-1}ABC^{-1} = B^{-1}A^{-1}CC^{-1}$$

Thus $C^{-1} = B^{-1}A^{-1}$. Reciprocation of a matrix product requires reversal
of the order of the factors as in the case of the transposed matrix product.
The rule holds for any number of factors.

If the elements of $A$ are complex numbers, the complex conjugate of $A$ is
defined as $A^* = [A^*_{ij}]$. Unlike the preceding case, if $F = ABC \cdots X$,
$F^* = A^*B^*C^* \cdots X^*$.

The matrix formed by taking the complex conjugate of all the elements
and then transposing the matrix is called the associate matrix,⁸
$A^\dagger = (\tilde{A})^* = \widetilde{(A^*)}$. If $F = ABC$, $F^\dagger = C^\dagger B^\dagger A^\dagger$.

At this point we have defined four important operations on a matrix $A$.
These result in $-A$, $\tilde{A}$, $A^{-1}$ and $A^*$. It is important to note that each of
these operations has the reflexive property, so that when the operation is
performed twice, the original matrix is reproduced:

$$-(-A) = A; \quad \tilde{\tilde{A}} = A; \quad (A^{-1})^{-1} = A; \quad (A^*)^* = A$$

By combining these operations in all possible ways, the following 16 matrices
may be derived from $A$:

$$\pm A, \quad \pm\tilde{A}, \quad \pm A^{-1}, \quad \pm A^*, \quad \pm(\tilde{A})^{-1}, \quad \pm(A^*)^{-1}, \quad \pm A^\dagger, \quad \pm(A^\dagger)^{-1}$$

⁷ This name seems to be in agreement with the usual mathematical convention.
Writers on quantum mechanics frequently call that matrix adjoint which we later call
associate. The reader should take care not to be confused by this situation.

⁸ This is the matrix called adjoint by writers on quantum mechanics. It is also
called the Hermitian conjugate.

In certain cases, A may be identical with some other member of this set. 
Such matrices have been given special names. We shall have occasion to 
discuss the properties of most of them later, but for convenience we list 
them now in Table 1. We will have no need of the types: $A = A^{-1}$ (involutory)
and $A = $

TABLE 1

Relation                  Name of A          Matrix Elements

$A = \tilde{A}$           symmetric          $A_{pq} = A_{qp}$
$A = -\tilde{A}$          skew symmetric     $A_{pp} = 0$; $A_{pq} = -A_{qp}$
$A = \tilde{A}^{-1}$      orthogonal         cf. eq. (42)
$A = A^*$                 real               $A_{pq} = A^*_{pq}$
$A = -A^*$                pure imaginary     $A_{pq} = iB_{pq}$; $B_{pq}$ real
$A = A^\dagger$           Hermitian          $A_{pq} = A^*_{qp}$
$A = -A^\dagger$          skew Hermitian     $A_{pq} = -A^*_{qp}$
$A = (A^\dagger)^{-1}$    unitary            cf. eq. (50)

Note that a real symmetric matrix is a special case of an Hermitian
matrix. Suppose $H = A + iB$ is Hermitian with both $A$ and $B$ real, then
$H^\dagger = \tilde{A} - i\tilde{B}$; but by definition $H = H^\dagger$. Thus the real part is
symmetric and the imaginary part skew symmetric; in other words, a real
Hermitian matrix is also symmetric. Similarly, a real orthogonal matrix is
unitary, for if $U = A + iB$ is unitary then by definition $U = (U^\dagger)^{-1}$,
$U^\dagger U = E$ and $(\tilde{A} - i\tilde{B})(A + iB) = E$. If $B = \tilde{B} = O$, then $\tilde{A}A = E$,
which defines the orthogonal matrix. However, a complex symmetric
matrix is not Hermitian nor is a complex orthogonal matrix unitary.

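Problem. Show that $AB = O$ but $BA \neq O$ where

$$A = \begin{bmatrix} -6 & -4 & -2 \\ -9 & -6 & -3 \\ 3 & 2 & 1 \end{bmatrix}; \quad B = \begin{bmatrix} 0 & 1 & -2 \\ -1 & 0 & 3 \\ 2 & -3 & 0 \end{bmatrix}$$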
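The claim of this problem is quickly confirmed by direct multiplication. The short sketch below uses the two matrices of the problem; the helper name `matmul` is an assumption.

```python
def matmul(A, B):
    """Ordinary matrix product of lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[-6, -4, -2], [-9, -6, -3], [3, 2, 1]]
B = [[0, 1, -2], [-1, 0, 3], [2, -3, 0]]

AB = matmul(A, B)
BA = matmul(B, A)
print("AB =", AB)   # every element vanishes
print("BA =", BA)   # not the null matrix
```

Neither factor is null, yet one of the two products is, which is the point of the warning made earlier in this section.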


10.8. Real Linear Vector Space. — Let us consider a space of two dimen- 
sions, that is, a plane in ordinary three-dimensional space. A vector in 
this space, as we have shown in sec. 4.1, is completely described by its two 
components or by the coordinates of its origin and terminus. It is also 
described by the matrix $\mathbf{x}$ of one column, or its transposed, the row vector
$\tilde{\mathbf{x}}$, the two real numbers which are its components being the two
matrix elements. After we have chosen one vector it is possible to find
another vector $\mathbf{y}$ in the same plane which is not a multiple of $\mathbf{x}$. In fact,
$\mathbf{y}$ is completely independent of $\mathbf{x}$. But no matter how we draw a third
vector $\mathbf{z}$, it may always be represented as

$$a\mathbf{x} + b\mathbf{y} = \mathbf{z}$$

where $a$ and $b$ are numbers. There is nothing unique about $\mathbf{x}$ and $\mathbf{y}$, the
point being that two and only two vectors are linearly independent in two 
dimensions and a third vector is linearly dependent on the other two. The 
situation may further be characterized as follows. If two vectors are 
linearly independent, no relation 

ax + by = 0 

can exist unless a = b = 0, for as we have seen a linear combination of two 
vectors gives a new vector. For the purposes of this chapter, we shall 
need more than two or three dimensions, hence we shall speak of a space of 
n dimensions, where n is an integer. When n is greater than three, it is, of 
course, impossible to visualize the situation, but the geometric concepts of 
ordinary space will be used wherever convenient. Thus an n-dimensional 
coordinate system will consist of n mutually perpendicular axes, a point 
will require n coordinates for its location and a vector will be described 
by means of its n components or by the coordinates of its origin and 
terminus. 

Suppose the components of a vector in such a space are real numbers
$x_1, x_2, \cdots, x_n$; then we may write the vector $\mathbf{x}$ as a matrix of either a single
row or a single column as in (5a) or (5b).

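The scalar product of two vectors⁹ is a scalar

$$\tilde{\mathbf{x}}\mathbf{y} = x_1y_1 + x_2y_2 + \cdots + x_ny_n \tag{10-11}$$

The square of the length of a vector is defined as in sec. 4.1

$$l^2 = \tilde{\mathbf{x}}\mathbf{x} = x^2 = x_1^2 + x_2^2 + \cdots + x_n^2 \tag{10-12}$$

The vectors $\mathbf{u}_1, \mathbf{u}_2, \cdots, \mathbf{u}_n$ are linearly independent if there exists no set of
scalar quantities $c_1, c_2, \cdots, c_n$ not all zero such that

$$c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + \cdots + c_n\mathbf{u}_n = 0 \tag{10-13}$$

The simplest way of testing vectors for linear independence is to evaluate
the Gram determinant (see sec. 3.13)

$$|\Gamma| = \begin{vmatrix} \tilde{\mathbf{u}}_1\mathbf{u}_1 & \tilde{\mathbf{u}}_1\mathbf{u}_2 & \cdots & \tilde{\mathbf{u}}_1\mathbf{u}_n \\ \tilde{\mathbf{u}}_2\mathbf{u}_1 & \tilde{\mathbf{u}}_2\mathbf{u}_2 & \cdots & \tilde{\mathbf{u}}_2\mathbf{u}_n \\ \vdots & \vdots & & \vdots \\ \tilde{\mathbf{u}}_n\mathbf{u}_1 & \tilde{\mathbf{u}}_n\mathbf{u}_2 & \cdots & \tilde{\mathbf{u}}_n\mathbf{u}_n \end{vmatrix}$$

If $|\Gamma|$ vanishes, the vectors are linearly dependent; if $|\Gamma| > 0$, linearly
independent.

⁹ For definiteness, we suppose that $\mathbf{x}$ and $\mathbf{y}$ are both column vectors. Note that $\tilde{\mathbf{x}}\mathbf{y}$
is the equivalent, in matrix notation, of $\mathbf{x} \cdot \mathbf{y}$ in vector notation.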
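The Gram test is easy to mechanize for small sets of vectors. The sketch below is illustrative: the function names and both sets of vectors are assumptions, and the determinant is computed by cofactor expansion, which is adequate only for small $n$.

```python
def dot(u, v):
    """Scalar product of two real vectors given as lists of components."""
    return sum(a * b for a, b in zip(u, v))


def det(M):
    """Determinant by cofactor expansion along the first row."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def gram_determinant(vectors):
    """Determinant of the matrix of all scalar products u_i . u_j."""
    return det([[dot(u, v) for v in vectors] for u in vectors])


# Hypothetical examples in three dimensions.
independent = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
dependent = [[1, 2, 3], [2, 4, 6], [0, 1, 0]]   # second vector is twice the first

print(gram_determinant(independent))   # positive: linearly independent
print(gram_determinant(dependent))     # zero: linearly dependent
```

With integer components the test is exact; with floating-point components the determinant would have to be compared with a small tolerance instead of zero.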

When $n$ linearly independent vectors have been chosen they form an
$n$-dimensional coordinate system or basis, being equivalent to a set of $n$
coordinate axes. Any other vector $\mathbf{v}$ may then be expressed as a linear
combination of the chosen vectors $\mathbf{u}_1, \mathbf{u}_2, \cdots, \mathbf{u}_n$, the linear combination
being unique. It should be emphasized that there is nothing unique about
the choice of the basis, for any $n$ linearly independent vectors are suitable
for that purpose although the most convenient choice, in general, is a set
of unit vectors. The latter are defined by the relations

$$\begin{aligned} \mathbf{e}_1 &= \{1,0,0,0,\cdots,0\} \\ \mathbf{e}_2 &= \{0,1,0,0,\cdots,0\} \\ \mathbf{e}_3 &= \{0,0,1,0,\cdots,0\} \\ &\;\;\vdots \\ \mathbf{e}_n &= \{0,0,0,0,\cdots,1\} \end{aligned}$$

or similarly as row vectors. Clearly they are of unit length and mutually
perpendicular, for

$$\tilde{\mathbf{e}}_i\mathbf{e}_j = \delta_{ij} \tag{10-14}$$

In terms of the unit vectors, any vector $\mathbf{x}$ may be written

$$\mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_n\mathbf{e}_n \tag{10-15}$$

If the origin of $\mathbf{x}$ is taken as coincident with the origin of the basis formed
by the $\mathbf{e}_i$, the components of $\mathbf{x}$ are the coordinates of the terminus of $\mathbf{x}$.

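It is often necessary to use a particular set of linearly independent
vectors as a basis, constructing from them a set whose members are mutually
perpendicular and of unit length. This procedure, known as Schmidt's
orthogonalization method, is effected in the following way. Suppose the $n$
given vectors are $\mathbf{u}_1, \mathbf{u}_2, \cdots, \mathbf{u}_n$. Select any one of them, say
$\mathbf{u}_1$, and let $\mathbf{v}_1 = \mathbf{u}_1$, $\mathbf{e}_1 = \mathbf{v}_1/l_1$, where $l_1$ is the length of $\mathbf{v}_1$. Now
take $\mathbf{v}_2 = \mathbf{u}_2 - c_{21}\mathbf{e}_1$, choosing $c_{21}$ so that $\tilde{\mathbf{e}}_1\mathbf{v}_2 = \tilde{\mathbf{e}}_1\mathbf{u}_2 - c_{21}\tilde{\mathbf{e}}_1\mathbf{e}_1 = 0$,
which requires $c_{21} = \tilde{\mathbf{e}}_1\mathbf{u}_2$ or $\mathbf{v}_2 = \mathbf{u}_2 - (\tilde{\mathbf{e}}_1\mathbf{u}_2)\mathbf{e}_1$. If we put $\mathbf{e}_2 = \mathbf{v}_2/l_2$,
where $l_2$ is the length of $\mathbf{v}_2$, we shall have $\tilde{\mathbf{e}}_1\mathbf{e}_2 = \delta_{12}$. Next let
$\mathbf{v}_3 = \mathbf{u}_3 - c_{31}\mathbf{e}_1 - c_{32}\mathbf{e}_2$, determining the constants so that
$\tilde{\mathbf{e}}_1\mathbf{v}_3 = \tilde{\mathbf{e}}_1\mathbf{u}_3 - c_{31} = 0$ and $\tilde{\mathbf{e}}_2\mathbf{v}_3 = \tilde{\mathbf{e}}_2\mathbf{u}_3 - c_{32} = 0$, which means that
$c_{31} = \tilde{\mathbf{e}}_1\mathbf{u}_3$ and $c_{32} = \tilde{\mathbf{e}}_2\mathbf{u}_3$. Finally let $\mathbf{e}_3 = \mathbf{v}_3/l_3$. Continuing in this way,
we may construct the complete set of $n$ unit vectors with

$$\mathbf{e}_{m+1} = \frac{\mathbf{v}_{m+1}}{l_{m+1}}; \quad \mathbf{v}_{m+1} = \mathbf{u}_{m+1} - \sum_{k=1}^{m} (\tilde{\mathbf{e}}_k\mathbf{u}_{m+1})\,\mathbf{e}_k; \quad m = 1, 2, \cdots, n-1$$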
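The Schmidt procedure just described can be sketched in a few lines. The code below follows the recursion of the text literally (subtract from each new vector its components along the unit vectors already found, then normalize); the function names, the tolerance and the sample vectors are assumptions made for illustration.

```python
import math


def dot(u, v):
    """Scalar product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


def schmidt(vectors, tol=1e-12):
    """Return mutually perpendicular unit vectors built from `vectors`.

    Each new v is u minus the sum of (e_k . u) e_k over the unit vectors
    e_k already constructed; e is then v divided by its length.
    """
    basis = []
    for u in vectors:
        v = list(u)
        for e in basis:
            c = dot(e, u)                      # coefficient c = e_k . u
            v = [vi - c * ei for vi, ei in zip(v, e)]
        length = math.sqrt(dot(v, v))
        if length < tol:
            raise ValueError("the given vectors are linearly dependent")
        basis.append([vi / length for vi in v])
    return basis


# Hypothetical set of three linearly independent vectors.
u = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
e = schmidt(u)
for i, ei in enumerate(e):
    print([round(dot(ei, ej), 12) for ej in e])   # rows of the unit matrix
```

In floating-point work on nearly dependent vectors this classical form loses orthogonality; taking the coefficients from the partly reduced vector `v` instead of from `u` is a commonly used, more stable variant.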

Problem. Consider the columns of the matrix of Problem a, sec. 10.5, as the com- 
ponents of four vectors. Test them for linear dependence. 





10.9. Linear Equations. — Matrix methods are useful in solving and 
discussing linear equations of the form 

$$\begin{aligned} A_{11}x_1 + A_{12}x_2 + \cdots + A_{1n}x_n &= y_1 \\ A_{21}x_1 + A_{22}x_2 + \cdots + A_{2n}x_n &= y_2 \\ \cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\ A_{n1}x_1 + A_{n2}x_2 + \cdots + A_{nn}x_n &= y_n \end{aligned} \tag{10-16}$$

which are inhomogeneous.¹⁰ They may also be written as

$$A\mathbf{x} = \mathbf{y}$$

The corresponding homogeneous equation is

$$A\mathbf{x} = O$$

The matrix $A$ and the vector $\mathbf{y}$ are to be considered as known while the
$n$ components of $\mathbf{x}$ are unknown. The questions of chief interest concern
the number of possible solutions and the method of finding them. Several
cases arise depending on the rank of $A$, but for our purposes we consider¹¹
only three possibilities.


a. $\mathbf{y} \neq 0$; $|A| \neq 0$. According to (10), $A^{-1}$ exists, hence

$$\mathbf{x} = A^{-1}\mathbf{y} = \frac{\hat{A}\mathbf{y}}{|A|}$$

is the unique solution. From the definition of $\hat{A}$ and the rule for matrix
multiplication, it also follows that

$$x_i = \frac{1}{|A|}\left(y_1A^{1i} + y_2A^{2i} + \cdots + y_nA^{ni}\right)$$

which is commonly known as Cramer's rule.¹²

b. $\mathbf{y} = 0$; $|A| \neq 0$. The only solutions are the trivial ones
$x_1 = x_2 = \cdots = x_n = 0$.
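Case (a) translates directly into a small routine. The following sketch evaluates the unknowns from the cofactors of $A$ in exact rational arithmetic; the function names and the sample system are assumptions, and for systems of any size direct elimination is the more economical route.

```python
from fractions import Fraction


def minor(A, i, j):
    """Matrix left after striking out row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by cofactor expansion along the first row."""
    if not A:
        return 1
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))


def cofactor(A, i, j):
    """Cofactor of the element A[i][j]."""
    return (-1) ** (i + j) * det(minor(A, i, j))


def cramer(A, y):
    """Solve A x = y as x_i = (sum over j of y_j * cofactor(A, j, i)) / |A|."""
    d = det(A)
    if d == 0:
        raise ValueError("|A| = 0: no unique solution")
    n = len(A)
    return [Fraction(sum(y[j] * cofactor(A, j, i) for j in range(n)), d)
            for i in range(n)]


# Hypothetical inhomogeneous system with |A| = -1.
A = [[2, 1, 1], [1, 3, 2], [1, 0, 0]]
y = [4, 5, 6]
x = cramer(A, y)
residual = [sum(A[i][k] * x[k] for k in range(3)) - y[i] for i in range(3)]
print(x, residual)   # the residual A x - y vanishes identically
```

With a zero right-hand side and non-vanishing determinant the same routine returns only zeros, which is case (b).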

c. $\mathbf{y} = 0$; $|A| = 0$; $A^{ik} \neq 0$ for at least one value of $i$ and $k$. If we
knew the value of one of the unknowns we could find the values of the
remaining $(n - 1)$ unknowns, since we could then form from the original
set $(n - 1)$ inhomogeneous equations with non-vanishing determinant.
In other words, we are confronted in case (c) with $n$ unknowns but only
$(n - 1)$ equations. However, we note that the $k$-th row of our set of
$n$ equations is

$$A_{k1}x_1 + A_{k2}x_2 + \cdots + A_{kn}x_n = 0 \tag{10-17}$$

so that if we take

$$x_i = cA^{ji} \tag{10-18}$$

where $c$ is any constant, it follows from (3) that (17) is satisfied. Even if
$j = k$, we still have a solution, for (17) is then identical with (2) but
$|A| = 0$. We thus have an infinite number of solutions of the homogeneous
equation when $|A| = 0$, as $j$ may take any value from 1 to $n$ and
$c$ is completely arbitrary. Of course, some of the solutions (18) may be
worthless, since several of the cofactors may vanish; but it will be found
that there are always enough non-vanishing ones so that the ratio of all the
unknowns is determined.¹³ The fact that the set of homogeneous equations
$A\mathbf{x} = O$ possesses non-trivial solutions only when $|A| = 0$ is of great
importance in many problems and will often be used in the next chapter.

¹⁰ See sec. 2.5 for the meaning of the term homogeneous.

¹¹ The others are discussed by Bôcher, M., "Introduction to Higher Algebra,"
Macmillan Co., New York, 1907.

¹² In actual calculations, it is usually simpler to solve (16) by direct elimination
(cf. sec. 13.26).

Problem a. Sometimes chemical analysis must be done in an indirect way. Solve 
the following problem by means of determinants. A mixture of sodium chloride, 
sodium bromide, and sodium iodide weighed 0.5000 gram. Upon the addition of silver 
nitrate, the mixed silver halides weighed 1.0369 g. The iodine in the mixture was pre- 
cipitated as palladous iodide, which weighed 0.3006 g. Find the composition of the 
original mixture. 

Ans. NaCl, 50%; NaBr, 25%; NaI, 25%.

Problem b. Solve the set of linear equations which result from application of
Kirchhoff's laws to the network known as a Wheatstone bridge, out of balance, and obtain
the current through the galvanometer. Label resistances $R_1, R_2, R_3, R_4$, going clockwise
around the "diamond." Let galvanometer resistance be $R_g$.

Ans.

$$i = \frac{(R_2R_4 - R_1R_3)E}{R_1R_2R_3 + R_2R_3R_4 + R_3R_4R_1 + R_4R_1R_2 + R_g(R_1 + R_2)(R_3 + R_4)}$$

where $E$ is the external electromotive force.

10.10. Linear Transformations. — Suppose two coordinate systems are 
described by unit vectors e and e', the relation between the two sets being 
given by a linear transformation like eq. (4.4):

$$\mathbf{e}'_i = \sum_{k} A_{ki}\mathbf{e}_k \tag{10-19}$$

Then if a given vector has the components $x_1, x_2, \cdots, x_n$ in one system and
$x'_1, x'_2, \cdots, x'_n$ in the other system, we have

$$\mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_n\mathbf{e}_n = \mathbf{x}' = x'_1\mathbf{e}'_1 + x'_2\mathbf{e}'_2 + \cdots + x'_n\mathbf{e}'_n$$

The last member of this equation may be expanded with the use of (19),
yielding

$$\sum_{i,k} A_{ki}x'_i\mathbf{e}_k$$

Equating the coefficients of $\mathbf{e}_k$ on left and right,

$$x_k = \sum_{i} A_{ki}x'_i$$

which may also be written

$$\mathbf{x} = A\mathbf{x}' \tag{10-20}$$

Provided $|A| \neq 0$, we may find $A^{-1}$, hence

$$A^{-1}\mathbf{x} = A^{-1}A\mathbf{x}' = \mathbf{x}' \tag{10-21}$$

We may thus use (20) or (21) to determine uniquely the components of
the same vector in either of two bases.

¹³ The proof is given by Bôcher, p. 4.

It is sometimes useful to think of this transformation in another way.¹⁴
Suppose we had only one basis. Then (20) could be regarded as a transformation
that changes a vector $\mathbf{x}'$ into a new vector $\mathbf{x}$. Relative to the
given basis, the vector $\mathbf{x}'$ is said to correspond to another vector $\mathbf{x}$, the new
vector being obtained from the old one by twisting, rotation, stretching or
other kinds of distortion. Matrix multiplication follows easily from this
viewpoint, for if $\mathbf{x} = A\mathbf{x}'$ and $\mathbf{x}' = B\mathbf{x}''$, then $AB = C$ is a matrix which
transforms $\mathbf{x}''$ directly into $\mathbf{x}$.

Problem a. Derive the law of matrix multiplication from the viewpoint of 
linear transformations. Hint: Write x = Ax' and x' = Bx" as n simultaneous linear 
equations and find x = Cx". 

10.11. Equivalent Matrices. — Let P and Q be non-singular matrices. 
Then A and B are said to be equivalent when 

$$B = PAQ \tag{10-22}$$

Equivalent matrices have many properties in common as the subsequent 
discussion will show; their importance is due to the fact that it is often 
possible by means of a linear transformation like (22) to find an equivalent 
matrix which has simpler properties than the original one. When the 
equivalent matrix is in its simplest form, usually diagonal, it is said to be 
canonical.¹⁵ The problem of finding an equivalent matrix of canonical
form is analogous to that of finding a suitable coordinate system in ordinary 
scalar algebra (cf. Chapter 5). 

Several special cases of equivalent matrices are possible, depending on 
the nature of the matrices P and Q effecting the transformation. 

a. If $PQ = E$, then

$$B = Q^{-1}AQ \tag{10-23}$$

The transformation is called collineatory or a similarity transformation
(Ähnlichkeitstransformation). The two matrices $A$ and $B$ are said to be
the transforms of each other.

b. If $P = \tilde{Q}$, the transformation is called congruent:

$$B = \tilde{Q}AQ \tag{10-24}$$

c. If $P = Q^\dagger$, then

$$B = Q^\dagger AQ \tag{10-25}$$

the transformation is conjunctive. If the matrices are all real, this becomes
identical with (24).

d. If $PQ = E$, $P = \tilde{Q}$ (i.e., $Q$ is orthogonal) and all the matrix elements
are real, then

$$B = \tilde{Q}AQ = Q^{-1}AQ \tag{10-26}$$

represents a real orthogonal transformation. It is both collineatory and
congruent.

e. If the matrix elements are complex and $PQ = E$, $P = Q^\dagger = Q^{-1}$
(i.e., $Q$ is unitary), then

$$B = Q^\dagger AQ = Q^{-1}AQ \tag{10-27}$$

is called a unitary transformation. It is collineatory and conjunctive.

¹⁴ These two interpretations of relation (20) are sometimes called the passive and the
active interpretation.

¹⁵ Canonical matrices are exhaustively discussed by Turnbull, H. W., and Aitken,
A. C., "The Theory of Canonical Matrices," Blackie and Son, London, 1932.

10.12. Bilinear and Quadratic Forms. — A homogeneous polynomial of 
the second degree in $2n$ variables $x_1, x_2, \cdots, x_n$; $y_1, y_2, \cdots, y_n$ is called a
bilinear form. It may be abbreviated as

$$A(\mathbf{x},\mathbf{y}) = \tilde{\mathbf{x}}A\mathbf{y} = \sum_{i,j} A_{ij}x_iy_j \tag{10-28a}$$

where $A = [A_{ij}]$. If both $\mathbf{x}$ and $\mathbf{y}$ undergo non-singular transformations

$$\mathbf{x} = \tilde{P}\mathbf{x}'; \quad \mathbf{y} = Q\mathbf{y}'$$

then

$$A(\mathbf{x},\mathbf{y}) = \tilde{\mathbf{x}}'PAQ\mathbf{y}' = \tilde{\mathbf{x}}'B\mathbf{y}' = B(\mathbf{x}',\mathbf{y}') \tag{10-29}$$

If $P = Q^{-1}$, so that $\mathbf{x} = \tilde{Q}^{-1}\mathbf{x}'$; $\mathbf{y} = Q\mathbf{y}'$, then $\mathbf{x}$ and $\mathbf{y}$ are called
contragredient variables since they undergo opposite transformations.

As a special case of a bilinear form suppose $\mathbf{x} = \mathbf{y}$. Then the coefficient
of $x_ix_j$ $(i \neq j)$ in (29) is $(A_{ij} + A_{ji})$, and the matrix $A$ becomes
symmetric if we write $(A_{ij} + A_{ji})/2$ for every $A_{ij}$ and $A_{ji}$. Eq. (28a)
may then be written

$$A(\mathbf{x},\mathbf{x}) = \sum_{i,j=1}^{n} A_{ij}x_ix_j = \tilde{\mathbf{x}}A\mathbf{x}; \quad A = \tilde{A} \tag{10-28b}$$

Such a function is a quadratic form; if it is positive for all real values of
the variables, it is called a positive definite quadratic form; if it is positive
or zero, it is called semi-definite.

10.13. Similarity Transformations. — Suppose the same vector is called $\mathbf{x}$
when referred to the basis $\mathbf{e}$ and $\mathbf{x}'$ when referred to another basis $\mathbf{e}'$.
Let another vector be $\mathbf{y}$ or $\mathbf{y}'$, where

$$\mathbf{x} = Q\mathbf{x}'; \quad \mathbf{y} = Q\mathbf{y}' \tag{10-30}$$

Such variables which undergo the same transformation are said to be
cogredient. Now consider the transformation

$$\mathbf{x} = A\mathbf{y} \tag{10-31}$$

which changes $\mathbf{y}$ into $\mathbf{x}$ in the basis $\mathbf{e}$. Then,

$$Q\mathbf{x}' = A\mathbf{y} = AQ\mathbf{y}'$$

or, if $Q$ is non-singular

$$\mathbf{x}' = Q^{-1}AQ\mathbf{y}' = B\mathbf{y}' \tag{10-32}$$

Hence, (32) is a transformation which changes $\mathbf{y}'$ into $\mathbf{x}'$ in the basis $\mathbf{e}'$
while $A$, the transform of $B$, performs the corresponding transformation
from $\mathbf{y}$ to $\mathbf{x}$ in the basis $\mathbf{e}$. This is the reason for the name similarity
transformation.

An alternative interpretation of the transform may be given. Let
$\mathbf{x}$, $\mathbf{y}$, $\mathbf{x}'$, $\mathbf{y}'$ be four different vectors all in the same basis and represented by
four points as in Fig. 1. Then (30) changes $\mathbf{x}'$ into $\mathbf{x}$ and $\mathbf{y}'$ into $\mathbf{y}$ while
(31) changes $\mathbf{y}$ into $\mathbf{x}$. The single transformation that changes $\mathbf{y}'$ directly
into $\mathbf{x}'$ is (32), since $\mathbf{x}' = Q^{-1}\mathbf{x} = Q^{-1}A\mathbf{y} = Q^{-1}AQ\mathbf{y}'$.



10.14. The Characteristic Equation of a Matrix. — If $\lambda$ is a scalar
parameter, $A$ is a square matrix of order $n$ and $E$ the unit matrix of the
same order, the matrix

$$K = [\lambda E - A]$$

is called the characteristic matrix of $A$. The equation

$$K(\lambda) = |K| = |\lambda E - A| = 0 \tag{10-33}$$

or its equivalent

$$K(\lambda) = \lambda^n + a_1\lambda^{n-1} + a_2\lambda^{n-2} + \cdots + a_n = 0 \tag{10-34}$$

where the $a_i$ are functions of the elements of $A$, is the characteristic equation
of $A$. The $n$ roots of $K(\lambda)$, $\lambda_1, \lambda_2, \lambda_3, \cdots, \lambda_n$, not necessarily all different,
are the characteristic (or latent) roots. On writing (34) in the form

$$(\lambda - \lambda_1)(\lambda - \lambda_2)(\lambda - \lambda_3) \cdots (\lambda - \lambda_n) = 0$$

and comparing coefficients of the different powers of $\lambda$ it will be seen that

$$\begin{aligned} \lambda_1 + \lambda_2 + \cdots + \lambda_n &= -a_1 \\ \lambda_1\lambda_2 + \lambda_1\lambda_3 + \cdots + \lambda_{n-1}\lambda_n &= a_2 \\ \lambda_1\lambda_2\lambda_3 + \cdots + \lambda_{n-2}\lambda_{n-1}\lambda_n &= -a_3 \\ \cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\ \lambda_1\lambda_2\lambda_3 \cdots \lambda_n &= (-1)^n a_n \end{aligned}$$

If $B = Q^{-1}AQ$, then $[\lambda E - B] = [\lambda E - Q^{-1}AQ] = Q^{-1}[\lambda E - A]Q$.
Moreover,

$$|\lambda E - B| = |Q^{-1}|\,|\lambda E - A|\,|Q| = |\lambda E - A| \tag{10-35}$$

Hence two matrices related by a similarity transformation have the same
characteristic roots.

We leave the proof of the following statements to the reader.

$$\mathrm{Tr}\,Q^{-1}AQ = \mathrm{Tr}\,A$$

$$|Q^{-1}AQ| = |A|$$

If $C = A \times B$, the characteristic roots of $C$ are the products, taken in
pairs, of the roots of $A$ and $B$.

Problem. Prove the statements of the preceding paragraph. 
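The invariance of the trace, the determinant and the characteristic determinant under a similarity transformation can be checked numerically before attempting the proof. The sketch below is illustrative only; the matrices `A`, `Q` and `Q_inv` and all function names are assumptions, and `Q_inv` is verified in the code rather than taken on trust.

```python
def matmul(A, B):
    """Ordinary matrix product of lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def det(M):
    """Determinant by cofactor expansion along the first row."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def trace(M):
    """Sum of the diagonal elements."""
    return sum(M[i][i] for i in range(len(M)))


def char_det(M, lam):
    """Value of the characteristic determinant |lam*E - M|."""
    n = len(M)
    return det([[(lam if i == j else 0) - M[i][j] for j in range(n)]
                for i in range(n)])


# Hypothetical matrices; Q has determinant 1, so its reciprocal is integral.
A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
Q = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
Q_inv = [[1, -1, 1], [0, 1, -1], [0, 0, 1]]

B = matmul(Q_inv, matmul(A, Q))           # the transform of A
print(trace(B) == trace(A), det(B) == det(A))
print(all(char_det(B, lam) == char_det(A, lam) for lam in range(-3, 4)))
```

Two polynomials of degree three that agree at seven values of $\lambda$ are identical, so the last check confirms that both matrices have the same characteristic equation.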

10.15. Reduction of a Matrix to Diagonal Form. — Consider the linear
transformation

$$A\mathbf{x} = \lambda\mathbf{x} \tag{10-36}$$

The only effect of the matrix $A$ on the vector $\mathbf{x}$ is to multiply it by the
constant scalar factor $\lambda$. Rewriting (36) in the form

$$[\lambda E - A]\mathbf{x} = K\mathbf{x} = O \tag{10-37}$$

we see that, except for the trivial case where all the components of $\mathbf{x}$ are
zero, $|K|$ must vanish (cf. sec. 10.9b, c). Hence, as shown in the previous
section, $\lambda$ can only take the values $\lambda_1, \lambda_2, \cdots, \lambda_n$ where $\lambda_i$ is one of the
characteristic roots of $K(\lambda)$. These quantities are the eigenvalues of the
matrix $A$; the accompanying sets of vectors $\mathbf{x}$ are the eigenvectors. Eq. (36)
is the matrix form of an eigenvalue equation, other examples of which were
discussed in Chapter 8.

Now suppose $B$ is a diagonal matrix; then the roots of its characteristic
equation are identical with its diagonal elements. If $A$ is not a diagonal
matrix but is related to $B$ by a similarity transformation, $B = Q^{-1}AQ$, then
it follows from (35) that its characteristic roots and equation are the same
as those of $B$. The problem of reducing $A$ to diagonal form by means of a
similarity transformation is thus closely related to the problem of finding
its eigenvalues. We shall now show how such a reduction may be made.

The eigenvalues themselves must first be obtained¹⁶ by solving (34).
Having found the $\lambda_i$, we wish to determine a matrix $X$ such that

$$X^{-1}AX = \Lambda = [\lambda_i\delta_{ij}] \tag{10-38}$$

We distinguish the following two cases. 

a. The Eigenvalues are all Different. Let us consider the case in which
all eigenvalues of $A$ are different. Select one, say $\lambda_k$, and form the $n$ linear
equations

$$A\mathbf{x} = \lambda_k\mathbf{x} \tag{10-39}$$

They are homogeneous, but as shown in sec. 10.9, we may solve them for
the ratio of the components of the eigenvector $\mathbf{x}_k$. Remembering that
each component contains an arbitrary constant, we write them as a column
vector

$$\mathbf{x}_k = \{x_{1k}, x_{2k}, \cdots, x_{nk}\}$$

The remaining eigenvectors are determined in the same way using each
eigenvalue in turn. Finally we form a matrix $X$ whose columns are the
eigenvectors of $A$. This matrix clearly satisfies the equation

$$AX = X[\lambda_i\delta_{ij}]$$

When this result is multiplied by $X^{-1}$, eq. (38) is obtained. We have thus
shown that the matrix $X$ which diagonalizes $A$ may be found by compounding
the eigenvectors of $A$ into a matrix. The reduction to diagonal
form here described is unique except for the order in which the eigenvalues
occur along the diagonal.

Although not required by the method, the eigenvectors may be orthog- 
onalized and normalized by the Schmidt process, which fixes the unde- 
termined constant appearing in the solution of (39). We return to this 
question in sec. 10.17. 
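The construction for distinct eigenvalues can be traced on a small example. In the sketch below the matrix and its eigenvalues 1, 2, 3 are an illustrative choice, the function names are assumptions, and each eigenvector is obtained from the cofactors of one row of $[\lambda E - A]$ as in case (c) of sec. 10.9. The check $AX = X\Lambda$ avoids forming $X^{-1}$ explicitly.

```python
def matmul(A, B):
    """Ordinary matrix product of lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def minor(A, i, j):
    """Matrix left after striking out row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by cofactor expansion along the first row."""
    if not A:
        return 1
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))


def eigenvector(A, lam):
    """Non-trivial solution of [lam*E - A] x = 0 from the cofactors of one row."""
    n = len(A)
    K = [[(lam if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    if det(K) != 0:
        raise ValueError("lam is not a characteristic root of A")
    for j in range(n):                       # try the rows in turn
        x = [(-1) ** (j + i) * det(minor(K, j, i)) for i in range(n)]
        if any(x):
            return x
    raise ValueError("all cofactors vanish; the method needs modification")


A = [[8, -8, -2], [4, -3, -2], [3, -4, 1]]   # illustrative matrix
eigenvalues = [1, 2, 3]                      # its characteristic roots

vectors = [eigenvector(A, lam) for lam in eigenvalues]
X = [[v[i] for v in vectors] for i in range(3)]          # eigenvectors as columns
Lam = [[eigenvalues[i] if i == j else 0 for j in range(3)] for i in range(3)]

print(X)
print(matmul(A, X) == matmul(X, Lam), det(X) != 0)
```

Since $|X| \neq 0$, multiplying $AX = X\Lambda$ from the left by $X^{-1}$ gives the diagonal form (10-38); each column of `X` may still be rescaled by an arbitrary constant.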

b. The Eigenvalues are Not all Different. When two or more of the
eigenvalues of $A$ are equal to each other, reduction to true diagonal form
is not always possible. Suppose $\lambda_1$ is an eigenvalue of $A$ repeated $r_1$ times.
Proceeding as before, we find an eigenvector $\mathbf{x}_1$ so that

$$A\mathbf{x}_1 = \lambda_1\mathbf{x}_1$$

Then if $\mathbf{x}_1$ is the first column of a square matrix $X$, the first column of $AX$
will be $\lambda_1\mathbf{x}_1$ and the first column of $X^{-1}AX$ will consist of $\lambda_1$ followed by
$(n - 1)$ zeros. Call this matrix $B$:

$$B = X^{-1}AX = \begin{bmatrix} \lambda_1 & B_{1j} \\ 0 & B_{ij} \end{bmatrix}$$

Here $B_{1j}$ is a row matrix with $(n - 1)$ elements and $B_{ij}$ is square of order
$(n - 1)$. Since $B$ is the transform of $A$, it also has the eigenvalue $\lambda_1$
repeated $r_1$ times, but $B_{ij}$ contains that eigenvalue only $(r_1 - 1)$ times.
This matrix is subjected to the same procedure as $A$: we find an eigenvector
and form a new matrix $Y$ whose first column is that eigenvector. Note
however that $Y$ has only $(n - 1)$ rows and columns and

$$Y^{-1}B_{ij}Y = \begin{bmatrix} \lambda_1 & C_{1k} \\ 0 & C_{kl} \end{bmatrix}$$

so that the matrix

$$\begin{bmatrix} 1 & 0 \\ 0 & Y \end{bmatrix}$$

will transform $B$ into the form

$$\begin{bmatrix} \lambda_1 & B_{1j}Y \\ 0 & Y^{-1}B_{ij}Y \end{bmatrix}$$

Continued applications of similar transformations will eventually result
in a single matrix $Z$ such that

$$Z^{-1}AZ = \begin{bmatrix} \Lambda_1 & F \\ O & G \end{bmatrix} \tag{10-40}$$

where

$$\Lambda_1 = \begin{bmatrix} \lambda_1 & H_{12} & H_{13} & \cdots & H_{1r_1} \\ 0 & \lambda_1 & H_{23} & \cdots & H_{2r_1} \\ 0 & 0 & \lambda_1 & \cdots & H_{3r_1} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_1 \end{bmatrix} \tag{10-41}$$

The matrix $F$ is rectangular with $r_1$ rows and $(n - r_1)$ columns while $G$ is
square and of order $(n - r_1)$. Now if $Z_1$ is a rectangular matrix composed
from the first $r_1$ columns of $Z$, then we may remove the unwanted matrix $F$
from (40), for

$$Z_1^{-1}AZ_1 = \Lambda_1$$

The next step is to treat the matrix $G$ in a similar way until it is reduced
to the form of $\Lambda_1$, with its eigenvalue $\lambda_2$ along the diagonal. We then continue
with each remaining matrix until every eigenvalue has been used.
Finally if we join together all of the rectangular matrices $Z_i$ to form a
square matrix $W$, we will have

$$W^{-1}AW = \mathrm{diag}\,(\Lambda_1, \Lambda_2, \cdots, \Lambda_r)$$

where $r$ is the number of distinct eigenvalues of $A$. Note that we are using
the notation of sec. 10.7 to denote a diagonal matrix, but in this case the
diagonal elements $\Lambda_i$ are really matrices themselves. Each is of the
triangular form of (41).

In the general case, it is possible to make a further transformation so
that the eigenvalues occur along the diagonal of $\Lambda_i$, while unity appears in
each position immediately above the eigenvalue and zero elsewhere.¹⁷
In the special cases where $A$ is symmetric, Hermitian, or unitary, the
non-diagonal elements of the triangular matrices may be completely removed
so that the final form is truly diagonal. We consider these cases in secs.
10.17, 10.19, 10.20.

¹⁶ Numerical methods of finding them are discussed in sec. 13.28.

Problem. Reduce to diagonal form

$$\begin{bmatrix} 8 & -8 & -2 \\ 4 & -3 & -2 \\ 3 & -4 & 1 \end{bmatrix}$$

Ans. $\lambda_1 = 1$; $\lambda_2 = 2$; $\lambda_3 = 3$.

10.16. Congruent Transformations. — When a change of variable, 
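$\mathbf{x} = Q\mathbf{y}$, is applied to a quadratic form, (28b) becomes

$$A(\mathbf{x},\mathbf{x}) = \tilde{\mathbf{x}}A\mathbf{x} = \tilde{\mathbf{y}}\tilde{Q}AQ\mathbf{y} = \tilde{\mathbf{y}}B\mathbf{y}$$

Thus the transformation of the matrix $A$ which corresponds to this change
of variable is congruent. Its importance is due to the fact that by its
use a quadratic form may be reduced to a sum of squared terms, as will
now be shown. Provided $B$ is diagonal, $\tilde{\mathbf{y}}B\mathbf{y}$ will be a sum of squares.
Hence our problem is that of diagonalizing the symmetric matrix $A$ by
means of a congruent transformation. Suppose $A_{11}$ in $A$ is not equal to
zero. Then $A$ may be written as

$$A = \begin{bmatrix} A_{11} & V \\ \tilde{V} & A' \end{bmatrix}$$

where $A'$ is the matrix obtained from $A$ by striking out the first row and
first column and $V = [A_{12}, A_{13}, \cdots, A_{1n}]$. Now let

$$Q_1 = \begin{bmatrix} 1 & -V/A_{11} \\ 0 & E_{n-1} \end{bmatrix}$$

where $E_{n-1}$ is the unit matrix of order $(n - 1)$. Then

$$\tilde{Q}_1AQ_1 = \begin{bmatrix} A_{11} & 0 \\ 0 & A'' \end{bmatrix}$$

and $A''$ is a matrix of order $(n - 1)$ whose elements are

$$B_{ij} = A_{ij} - \frac{A_{1i}A_{1j}}{A_{11}}$$

The matrix $A''$ may be treated in the same way and the process continued
until $A$ is completely reduced to diagonal form:

$$\tilde{Q}AQ = \mathrm{diag}\,(\alpha_1, \alpha_2, \cdots, \alpha_n)$$

The final result is

$$\tilde{\mathbf{x}}A\mathbf{x} = \tilde{\boldsymbol{\xi}}D\boldsymbol{\xi} = \alpha_1\xi_1^2 + \alpha_2\xi_2^2 + \cdots + \alpha_n\xi_n^2$$

with

$$D = \tilde{Q}AQ = [\alpha_i\delta_{ij}]$$

$$Q = Q_1Q_2Q_3 \cdots Q_n$$

and

$$\mathbf{x} = Q\boldsymbol{\xi}$$

¹⁷ The proof requires the theory of elementary divisors; see Turnbull and Aitken,
loc. cit.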

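The matrix $Q$ will have the form

$$Q = \begin{bmatrix} 1 & -A_{12}/A_{11} & -A_{13}/A_{11} & \cdots & -A_{1n}/A_{11} \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} \times \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & -B_{23}/B_{22} & \cdots & -B_{2n}/B_{22} \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} \times \cdots = \begin{bmatrix} 1 & Q_{12} & Q_{13} & \cdots & Q_{1n} \\ 0 & 1 & Q_{23} & \cdots & Q_{2n} \\ 0 & 0 & 1 & \cdots & Q_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}$$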

The determinants $\Delta_m$, formed from $A$ by omitting all but the first
$m$ rows and columns, are called the discriminants of the quadratic form.
Moreover, as the reader may show (Problem b),

$$\alpha_1 = \Delta_1; \quad \alpha_2 = \Delta_2/\Delta_1; \quad \alpha_3 = \Delta_3/\Delta_2; \quad \cdots; \quad \alpha_n = \Delta_n/\Delta_{n-1}$$

If it is so desired, a further linear transformation $\eta_i = \sqrt{\alpha_i}\,\xi_i$ will reduce
$\tilde{\mathbf{x}}A\mathbf{x}$ to the form

$$\tilde{\boldsymbol{\eta}}E\boldsymbol{\eta} = \eta_1^2 + \eta_2^2 + \cdots + \eta_n^2$$

Assuming that no element $A_{ii}$ is zero, we note that instead of starting
with $A_{11}$ at the first step, as we have done here, any of the remaining
$A_{ii}$, $(n - 1)$ in number, might have been chosen. At the second step,
there are $(n - 2)$ choices available and so on. Thus the final forms of $Q$
and $D$ are not unique. When some of the $A_{ii}$ are zero or when some of the
discriminants vanish, modifications¹⁸ are required in the method.

Problem a. Show that for $n = 3$, the elements of $Q$ may be taken as $Q_{12} = -A_{12}/A_{11}$;
$Q_{13} = A^{13}/A^{33}$; $Q_{23} = A^{23}/A^{33}$.

Problem b. Verify the relation given in the preceding paragraph between the
diagonal element and the discriminant.

Problem c. Reduce the following expression to a sum of squares:
$2x_1^2 + 7x_2^2 + 3x_3^2 + 4x_1x_2 + 8x_1x_3 - 2x_2x_3$. For one answer, $\alpha_1 = 2$; $\alpha_2 = 5$; $\alpha_3 = -10$.
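The stepwise congruent reduction of this section can be sketched as follows, using the symmetric matrix of the quadratic form in Problem c. The function names are assumptions; the routine always takes the leading diagonal element as pivot and simply stops if that element vanishes, so it does not cover the modified procedure needed in that case.

```python
from fractions import Fraction


def matmul(A, B):
    """Ordinary matrix product of lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*A)]


def identity(n):
    """Unit matrix of order n."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def congruent_diagonalize(A):
    """Return (Q, D) with D = Q~ A Q diagonal, for symmetric A with non-zero pivots."""
    n = len(A)
    D = [[Fraction(x) for x in row] for row in A]
    Q = identity(n)
    for k in range(n - 1):
        pivot = D[k][k]
        if pivot == 0:
            raise ValueError("zero pivot: the method requires modification")
        Qk = identity(n)
        for j in range(k + 1, n):
            Qk[k][j] = -D[k][j] / pivot      # row k of the k-th factor
        D = matmul(transpose(Qk), matmul(D, Qk))
        Q = matmul(Q, Qk)
    return Q, D


# Matrix of 2x1^2 + 7x2^2 + 3x3^2 + 4x1x2 + 8x1x3 - 2x2x3 (cross terms halved).
A = [[2, 2, 4], [2, 7, -1], [4, -1, 3]]
Q, D = congruent_diagonalize(A)
alphas = [D[i][i] for i in range(3)]
print(alphas)   # coefficients of the squared terms
print(Q)        # upper triangular with unit diagonal
```

The coefficients agree with the answer quoted for Problem c, and `Q` comes out upper triangular with unit diagonal, as in the general form displayed above.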

10.17. Orthogonal Transformations. — In this section we limit our dis- 
cussion to real orthogonal matrices, since we shall have no need for those 
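containing complex elements. By definition, if $R$ is orthogonal, $\tilde{R} = R^{-1}$,
hence $\tilde{R}R = R\tilde{R} = E$. One of their most important properties arises from
the fact that transformation by them leaves the length of a vector unchanged.
Suppose $\mathbf{x}$ and $\mathbf{y}$ are related by an orthogonal transformation

$$\mathbf{x} = R\mathbf{y}; \quad \tilde{\mathbf{x}} = \tilde{\mathbf{y}}\tilde{R}$$

then

$$\tilde{\mathbf{x}}\mathbf{x} = \tilde{\mathbf{y}}\tilde{R}R\mathbf{y} = \tilde{\mathbf{y}}\mathbf{y}$$

Our assertion is proved since $\tilde{\mathbf{x}}\mathbf{x}$ is the square of the length of $\mathbf{x}$ and $\tilde{\mathbf{y}}\mathbf{y}$ is
the square of the length of $\mathbf{y}$. On expanding $R\tilde{R} = \tilde{R}R = E$ we find that

$$\sum_{s=1}^{n} R_{ps}R_{qs} = \sum_{s=1}^{n} R_{sp}R_{sq} = \delta_{pq} \tag{10-42}$$

These relations are the necessary and sufficient conditions that a matrix
be orthogonal.

From the definition $\tilde{R}R = E$, we also see that $|\tilde{R}| \times |R| = |R|^2 = 1$,
hence

$$|R| = \pm 1$$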

Let us consider two matrices

$$R' = \begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix}; \quad R'' = \begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & -1 \end{bmatrix}$$

which are easily shown to be orthogonal, the first having the determinant
$+1$, the second $-1$. If we multiply a three-dimensional vector with components
$x$, $y$, $z$ from the left by these matrices, the result using $R'$ will
be a rotation by the angle $\phi$ about the $Z$-axis, while $R''$ produces a similar
rotation followed by a reflection in the $XY$-plane. These two cases are
called proper and improper rotations. Now the most general rotation of a
vector in three-dimensional space may be regarded as a combination of these
two operations; similarly, the most general orthogonal transformation of
order three is a combination of $R'$ and $R''$. Transformations by an orthogonal
matrix of order $n$ may be considered as a rotation or rotary reflection
in $n$-dimensional space. Moreover, we know from geometry that a rotation
leaves the length of a vector unchanged, which we have shown to be
true for an orthogonal transformation. For the special case, $n = 3$,
eq. (42) is identical with (4-2) and (4-3) and the elements of $R$ are the
direction cosines of three mutually perpendicular axes referred to three
fixed axes. Finally we note that if $\mathbf{x}$ is a vector whose components are
given in a rotating coordinate system and $\mathbf{y}$ is the same vector in a fixed
system, then the instantaneous relation between the two sets of components
is $\mathbf{x} = R\mathbf{y}$.

¹⁸ These cases are discussed by Bôcher, loc. cit.
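The properties of proper and improper rotations about the $Z$-axis can be checked numerically. The sketch below is illustrative; the function names, the angle and the sample vector are assumptions, and floating-point results are compared within a small tolerance.

```python
import math


def rotation_z(phi, improper=False):
    """Rotation by phi about the Z-axis; with improper=True it is followed
    by a reflection in the XY-plane."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, -1.0 if improper else 1.0]]


def matvec(R, v):
    """Product of a matrix with a column vector."""
    return [sum(r * x for r, x in zip(row, v)) for row in R]


def length(v):
    """Length of a real vector."""
    return math.sqrt(sum(x * x for x in v))


def det3(M):
    """Determinant of a 3 x 3 matrix."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def is_orthogonal(R, tol=1e-12):
    """Check the column conditions of eq. (10-42): sum over s of R_sp R_sq = delta_pq."""
    n = len(R)
    return all(abs(sum(R[s][p] * R[s][q] for s in range(n)) - (p == q)) < tol
               for p in range(n) for q in range(n))


phi = 0.7                      # hypothetical angle in radians
y = [1.0, -2.0, 3.0]           # hypothetical vector
proper = rotation_z(phi)
improper = rotation_z(phi, improper=True)

for R in (proper, improper):
    x = matvec(R, y)
    print(is_orthogonal(R), round(det3(R), 12), abs(length(x) - length(y)) < 1e-12)
```

Both matrices satisfy (10-42) and preserve length; only the sign of the determinant distinguishes the proper from the improper rotation.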

The fact that an orthogonal transformation is both congruent and 
collineatory makes it useful for the following reason: It has been seen 
that the congruent transformation may be used to reduce a quadratic form 
A (x,x) to a sum of squares, but the reduction is by no means unique. On 
the other hand, suppose the quadratic form has been reduced to a sum of 
squares by a congruent transformation and the elements of the transform- 
ing matrix are real. They can then be orthogonalized and normalized 
according to (42) and the resulting matrix R is both congruent and collinea- 
tory (hence orthogonal). In symbols, 

$$x = Ry; \qquad A(x,x) = \tilde{x}Ax = \tilde{y}R^{-1}ARy = \tilde{y}\Lambda y$$

where $\Lambda$ is diagonal with the eigenvalues of A for elements. It will be
remembered that when a matrix is reduced to diagonal form by a similarity
transformation, the eigenvectors which form the columns of the transforming
matrix X are not completely determined because the equations to be
solved for the components of the eigenvectors are homogeneous. This
arbitrariness now disappears, for we must fix the ratio of the components
according to (42) so that the transforming matrix is orthogonal and
$\tilde{R}R = E$.

We are now in a position to prove a statement made in sec. 10.15, 
namely that if a matrix A is symmetric and has multiple eigenvalues it 
may still be reduced to true diagonal form by an orthogonal transformation. 
Suppose A undergoes a congruent transformation by the matrix Q. Then 

the new matrix $\tilde{Q}AQ$ is symmetric if $A$ is symmetric, for
$\widetilde{(\tilde{Q}AQ)} = \tilde{Q}\tilde{A}Q = \tilde{Q}AQ$.
It thus follows that an orthogonal transformation will leave the symmetry
of A unchanged. This can be true only if the off-diagonal elements of the
triangular matrices of (41) are zero, but then $\Lambda$ is diagonal.

Orthogonal transformations are often called principal axis transformations
since they are used in the problem of reducing a conic to principal
axes and in finding the principal axes of a rotating body, or in reducing
kinetic and potential energy expressions to sums of squared terms. The
eigenvectors are frequently called normal coordinates in these cases.*

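A similar procedure serves to reduce† simultaneously two quadratic
forms to a sum of squares. Suppose the two forms are $A(x,x) = \tilde{x}Ax$
and $B(x,x) = \tilde{x}Bx$. First reduce $A(x,x)$ to a sum of squares by a
congruent transformation, $x = Qy$, which will give

$$\tilde{x}Ax = \tilde{y}\tilde{Q}AQy = \tilde{y}Dy = \sum_i a_i y_i^2$$

The same transformation applied to B will give

$$\tilde{x}Bx = \tilde{y}\tilde{Q}BQy = \tilde{y}Cy$$

but C is not diagonal. Now make the substitution $\eta_i = \sqrt{a_i}\,y_i$,
which results in

$$\tilde{y}Dy = \tilde{\eta}E\eta; \qquad \tilde{y}Cy = \tilde{\eta}C'\eta$$

where the $a_i$ have been absorbed into C to give C'. Finally, an orthogonal
transformation, $\eta = R\xi$, will reduce C' to diagonal form, yielding

$$\tilde{y}Dy = \tilde{\eta}E\eta = \tilde{\xi}\tilde{R}ER\xi = \tilde{\xi}E\xi$$

$$\tilde{y}Cy = \tilde{\xi}\tilde{R}C'R\xi = \tilde{\xi}\Lambda\xi = \lambda_1\xi_1^2 + \lambda_2\xi_2^2 + \cdots + \lambda_n\xi_n^2$$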
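The three steps of this reduction can be traced on a small case. The following is a minimal sketch for two hypothetical 2 x 2 symmetric matrices, with A positive definite so that the square roots exist. For convenience the first congruent transformation Q is itself taken to be a plane rotation, which is one admissible choice and not the only one. The helper names are illustrative.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def congruent(m, t):
    """Congruent transformation: transpose(t) * m * t."""
    return matmul(transpose(t), matmul(m, t))

def diagonalizing_rotation(m):
    """Plane rotation R for which transpose(R) * m * R is diagonal (m symmetric 2x2)."""
    theta = 0.5 * math.atan2(2.0 * m[0][1], m[0][0] - m[1][1])
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Hypothetical forms; A is positive definite.
A = [[2.0, 1.0], [1.0, 2.0]]
B = [[1.0, 2.0], [2.0, 3.0]]

# Step 1: x = Q y reduces A to D = diag(a_1, a_2).
Q = diagonalizing_rotation(A)
D = congruent(A, Q)

# Step 2: eta_i = sqrt(a_i) y_i, i.e. y = scale * eta, turns D into the unit matrix.
scale = [[1.0 / math.sqrt(D[0][0]), 0.0], [0.0, 1.0 / math.sqrt(D[1][1])]]
C_prime = congruent(congruent(B, Q), scale)

# Step 3: eta = R xi diagonalizes C' and leaves the unit matrix unchanged.
R = diagonalizing_rotation(C_prime)

# Combined transformation x = T xi.
T = matmul(Q, matmul(scale, R))
A_final = congruent(A, T)
B_final = congruent(B, T)
lambdas = [B_final[0][0], B_final[1][1]]

print(A_final)
print(B_final)
```

After the combined transformation A becomes the unit matrix and B becomes diagonal; the diagonal elements are the roots of det(B - lambda A) = 0.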

Even when the two quadratic forms are not functions of the same 
variables, the transformation may often be made. For example, in the 
mechanical problem of small oscillations where it is required to find normal 
coordinates for the kinetic and potential energies, the two quadratic forms 
appear as $T = \tilde{v}Av$ and $V = \tilde{x}Bx$ where $v = dx/dt$, T being
positive definite. The reduction causes no difficulty since the cogredient
variables x and v both undergo the same transformation.‡

We show in eq. (53) that for a unitary matrix, $\lambda_i\lambda_i^* = 1$ for
every i. Since a real orthogonal matrix is also unitary, it follows that the
only possible eigenvalues for a real orthogonal matrix are $\pm 1$ or
$e^{\pm i\phi}$. In the latter case, $\phi$ must be real and the exponentials
occur in pairs with opposite signs. If k is the number of eigenvalues equal
to $-1$, the determinant of the matrix equals $(-1)^k$.

* See Chapter 9; for a fuller discussion, see Whittaker, E. T., “A Treatise on the
Analytical Dynamics of Particles and Rigid Bodies,” Third Edition, Cambridge Press,
1927.

† The reduction is not always possible but it can be made if one of the forms is
positive definite, as is the case in most physical problems.

‡ An example of this case was discussed in sec. 9.11; see also Whittaker, loc. cit.,
Chapter VII.

In the previous discussion of this section, we have shown that when the
eigenvalues are real it is possible to reduce matrices to diagonal form by
means of an orthogonal transformation. Now suppose that R, the matrix
to be reduced, is itself orthogonal and of order n, and that the n eigenvalues
are $+1$ occurring $j_1$ times, $-1$ occurring $j_2$ times
($j_1 + j_2 = j \le n$) and $e^{\pm i\phi_k}$. Since the latter must appear in
pairs there are an even number of them, i.e., $n - j = 2m$ and
$k = 1, 2, \cdots, m$. Some simplification of the final diagonal matrix may be
made by noting that if $\phi_k = 0$ or $\pi$, $e^{\pm i\phi_k}$ equals $\pm 1$.
Thus we may write an even number of the eigenvalues $\pm 1$ as
exponentials and obtain the following special cases for $\Lambda$, the
diagonalized form of R.

n even, n = 2k

$$|R| = +1; \quad \Lambda = \mathrm{diag}\,(e^{i\phi_1}, e^{-i\phi_1}, \cdots, e^{i\phi_k}, e^{-i\phi_k})$$

$$|R| = -1; \quad \Lambda = \mathrm{diag}\,(1, -1, e^{i\phi_1}, e^{-i\phi_1}, \cdots, e^{i\phi_{k-1}}, e^{-i\phi_{k-1}})$$

n odd, n = 2k + 1

$$|R| = +1; \quad \Lambda = \mathrm{diag}\,(1, e^{i\phi_1}, e^{-i\phi_1}, \cdots, e^{i\phi_k}, e^{-i\phi_k})$$

$$|R| = -1; \quad \Lambda = \mathrm{diag}\,(-1, e^{i\phi_1}, e^{-i\phi_1}, \cdots, e^{i\phi_k}, e^{-i\phi_k}) \tag{10-43}$$

Now consider the form of the matrix X which diagonalizes R:

$$X^{-1}RX = \Lambda$$

Its r-th column is an eigenvector $x_r$ of R and its eigenvalue will be assumed
to be $e^{i\phi_r}$. But according to (36)

$$Rx_r = e^{i\phi_r}x_r \tag{10-44}$$

hence $x_r$ will in general be complex and of the form

$$x_r = x_r' + ix_r''$$

where $x_r'$ and $x_r''$ are real. In a similar manner, it follows that the
eigenvector and column of X corresponding to $e^{-i\phi_r}$ is
$x_r' - ix_r''$. We conclude
that usually the transforming matrix X will have some complex elements.
Recalling the fact that in the previous case, where X was orthogonal, the
transformation was both collineatory and congruent we see that the
necessary modification here is that the transformation be collineatory and
conjunctive. The transforming matrix, then, is unitary, hence we can only
diagonalize an orthogonal matrix by means of a unitary matrix.
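The simplest instance is the plane rotation. The sketch below assumes the two-dimensional rotation matrix with angle `phi` and the normalized complex eigenvectors $(1, -i)/\sqrt{2}$ and $(1, i)/\sqrt{2}$, which are written down by hand for this example. It checks that the matrix X built from them is unitary and that $X^\dagger RX$ is diagonal with $e^{\pm i\phi}$ on the diagonal.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(m):
    """Hermitian conjugate: transpose and take complex conjugates."""
    return [[m[j][i].conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]

phi = 0.9
R = [[math.cos(phi), -math.sin(phi)],
     [math.sin(phi), math.cos(phi)]]

# Columns of X are the normalized eigenvectors of R (chosen by hand).
s = 1.0 / math.sqrt(2.0)
X = [[s, s],
     [-1j * s, 1j * s]]

unit_check = matmul(dagger(X), X)          # should be the unit matrix
Lam = matmul(dagger(X), matmul(R, X))      # should be diag(e^{i phi}, e^{-i phi})

print(unit_check)
print(Lam)
print(cmath.exp(1j * phi), cmath.exp(-1j * phi))
```

The real and imaginary parts of the first column, $(1, 0)$ and $(0, -1)$ up to normalization, are real vectors; the rotation sends each of them into a combination of the two, and this is what the real reduced form expresses.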

Let us see what would happen if we transform a real orthogonal matrix
R by another real orthogonal matrix S.

$$S^{-1}RS = Z$$

We write (44) as

$$R(x_r' + ix_r'') = e^{i\phi_r}(x_r' + ix_r'') \tag{10-45}$$

for the particular eigenvalue $e^{i\phi_r}$. Since
$e^{i\phi_r} = \cos\phi_r + i\sin\phi_r$ we may
equate the real and imaginary parts of (45) to get

$$Rx_r' = x_r'\cos\phi_r - x_r''\sin\phi_r$$

$$Rx_r'' = x_r'\sin\phi_r + x_r''\cos\phi_r$$

with a similar expression for the column of X that comes from
$e^{-i\phi_r}$. If we replace the complex eigenvector
$x_r = x_r' + ix_r''$ by $x_r'$ and $x_r' - ix_r''$ by
$x_r''$ the resulting matrix contains only real elements and may be made
orthogonal by requiring that (42) be fulfilled. Let us call this matrix S.
Transformation by it will give the following forms for Z:

n even, n = 2k

$$|R| = +1; \quad Z = \mathrm{diag}\,(C_1, C_2, \cdots, C_k)$$

$$|R| = -1; \quad Z = \mathrm{diag}\,(1, -1, C_1, C_2, \cdots, C_{k-1})$$

n odd, n = 2k + 1

$$|R| = +1; \quad Z = \mathrm{diag}\,(1, C_1, C_2, \cdots, C_k)$$

$$|R| = -1; \quad Z = \mathrm{diag}\,(-1, C_1, C_2, \cdots, C_k)$$

where

$$C_k = \begin{bmatrix} \cos\phi_k & \sin\phi_k \\ -\sin\phi_k & \cos\phi_k \end{bmatrix}$$

It is worth while to point out that the only other possible
two-dimensional real orthogonal matrix is of the type

$$\begin{bmatrix} \cos\phi & \sin\phi \\ \sin\phi & -\cos\phi \end{bmatrix}$$

Its eigenvalues are $\pm 1$, hence such matrices cannot occur in the reduced
form of R as we have already included all real eigenvalues in the preceding
expressions for Z.

Problem. Prove that $R'$ and $R''$ are orthogonal matrices. Reduce each to
diagonal form.


10.18. Hermitian Vector Space. — Since many of the matrices occurring
in physical problems* contain complex elements, it is necessary to
amplify the vector concept presented in sec. 10.8. We write in place of
(11), the Hermitian scalar product

$$x^\dagger y = x_1^*y_1 + x_2^*y_2 + \cdots + x_n^*y_n \tag{10-46}$$

* For the use of matrix theory in quantum mechanics, see Chapter 11. For further
discussion, see Born, M., and Jordan, P., “Elementare Quantenmechanik,” J. Springer,
Berlin, 1930; Wigner, E., “Gruppentheorie und ihre Anwendung auf die
Quantenmechanik der Atomspektren,” Vieweg, Braunschweig, 1931.

The square of the absolute length of a vector is then real,

$$x^\dagger x = x_1^*x_1 + x_2^*x_2 + \cdots + x_n^*x_n$$

If $x^\dagger y = y^\dagger x = 0$, the two vectors are orthogonal or mutually
perpendicular. If $x^\dagger x = 1$, the vector is a unit vector or normalized.
For a scalar a,

$$x^\dagger(ay) = a\,x^\dagger y; \qquad (ax)^\dagger y = a^*x^\dagger y$$

The Hermitian scalar product is distributive

$$x^\dagger(y + z) = x^\dagger y + x^\dagger z$$

If A is any matrix,

$$x^\dagger Ay = (A^\dagger x)^\dagger y; \qquad (Ax)^\dagger y = x^\dagger(A^\dagger y) \tag{10-47}$$
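The rules for the Hermitian scalar product are easily verified on numbers. This is a minimal sketch with arbitrary complex vectors and an arbitrary matrix; the function names `hdot`, `dagger` and `apply` are illustrative only.

```python
def hdot(x, y):
    """Hermitian scalar product: sum of conj(x_i) * y_i."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

def dagger(m):
    """Hermitian conjugate of a matrix stored as nested lists."""
    return [[m[j][i].conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

# Arbitrary test data.
x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
z = [0.5 + 0j, -1 + 1j]
a = 2 - 3j
A = [[1 + 1j, 2j], [3 + 0j, 4 - 1j]]

norm_sq = hdot(x, x)                                   # real
left = hdot(x, [a * yi for yi in y])                   # a * (x, y)
right = hdot([a * xi for xi in x], y)                  # conj(a) * (x, y)
distributive = hdot(x, [yi + zi for yi, zi in zip(y, z)])
adjoint_lhs = hdot(x, apply(A, y))                     # (x, A y)
adjoint_rhs = hdot(apply(dagger(A), x), y)             # (A-dagger x, y)

print(norm_sq, left, right, distributive, adjoint_lhs, adjoint_rhs)
```

A scalar factor taken out of the first vector appears as its complex conjugate; this is the one respect in which the Hermitian product departs from the real scalar product.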

10.19. Hermitian Matrices. — If the variables in the bilinear form (28)
are complex conjugate to each other and if its matrix is Hermitian, the
form is called Hermitian. Thus,

$$H(x,x) = \sum_{i,j} H_{ij}x_i^*x_j = x^\dagger Hx; \qquad H_{ij} = H_{ji}^* \tag{10-48}$$

In spite of the fact that the elements of (48) are complex, the form itself
is real.

The eigenvalues of an Hermitian matrix are also all real. Suppose
$\lambda_i$ is an eigenvalue corresponding to an eigenvector x, then

$$Hx = \lambda_i x; \qquad x^\dagger Hx = \lambda_i x^\dagger x$$

Since $x^\dagger Hx$ and $x^\dagger x$ are both real, it follows that
$\lambda_i$ is real.

An Hermitian matrix H remains Hermitian when transformed by
either an orthogonal or a unitary matrix. To prove this statement for a
real orthogonal matrix R, suppose $H_1$ is known to be Hermitian and
$R^{-1}H_1R = H_2$. Then, since $\tilde{R} = R^{-1}$, we have
$H_2 = R^{-1}H_1R = \tilde{R}H_1R$ and
$H_2^\dagger = R^\dagger H_1^\dagger R^*$. But $H_1^\dagger = H_1$ and R is
assumed to be real, so $R^\dagger = \tilde{R}$, $R^* = R$. Thus
$H_2^\dagger = \tilde{R}H_1R = R^{-1}H_1R = H_2$. The proof for a unitary
matrix is similar.

As we have previously stated, a real symmetric matrix is a special case 
of the Hermitian matrix. Thus, except for slight modifications, the reduc- 
tion of Hermitian matrices to diagonal form is similar to the procedure 
used for real matrices. For example, an Hermitian form may be con- 
verted to a sum of squares in many ways by a conjunctive transformation. 
On the other hand, we saw in sec. 10.15 that a matrix could be converted
to diagonal form, with its eigenvalues on the diagonal, by means of a
collineatory transformation. If the matrix is Hermitian, we may require
that the transformation be both collineatory and conjunctive, hence
unitary, and the diagonal form is then unique. The same argument which
was used for a real symmetric matrix shows us that even if the eigenvalues 
are not all different, the true diagonal form may be obtained since trans- 
formation by a unitary matrix leaves the symmetry of an Hermitian matrix 
unchanged. 

The necessary condition that two Hermitian forms be simultaneously
reducible to a sum of squares is that their matrices commute.* Suppose that
$H(x,x) = x^\dagger Hx$ and $K(x,x) = x^\dagger Kx$ are given and that both H
and K are Hermitian or unitary.† Let S be a unitary matrix that reduces H
and K simultaneously to diagonal forms, H' and K':

$$H' = S^{-1}HS; \qquad K' = S^{-1}KS$$

Clearly H' and K' commute since they are both diagonal, hence we may
write

$$H'K' = S^{-1}HSS^{-1}KS = S^{-1}HKS$$

$$K'H' = S^{-1}KSS^{-1}HS = S^{-1}KHS$$

or

$$S^{-1}HKS = S^{-1}KHS$$

since $K'H' = H'K'$. It thus follows that $HK = KH$.


Problem. Prove that an Hermitian matrix remains Hermitian after transforma- 
tion by a unitary matrix. 


* The sufficiency of this condition is proved by Weyl, H., “The Theory of Groups
and Quantum Mechanics,” Methuen, London, 1931.

† If these matrices were not Hermitian or unitary, neither of them could be reduced
to diagonal form (unless all eigenvalues were distinct, a case which is not very
interesting).

10.20. Unitary Matrices. — If we indicate a unitary matrix by U, then
from its definition

$$U = (U^\dagger)^{-1}$$

hence

$$U^\dagger U = UU^\dagger = E \tag{10-49}$$

Suppose the elements in a single column of U are given by $U_{ij}$, then the
Hermitian scalar product of two columns

$$\sum_i U_{ij}^*U_{ik} = \delta_{jk}$$

A similar relation may be found between the rows. Hence the rows and
columns of a unitary matrix of order n form a set of n mutually
perpendicular unit vectors in Hermitian space. This may be seen at once by
writing (49) explicitly:

$$\sum_s U^\dagger_{ps}U_{sq} = \sum_s U_{sp}^*U_{sq} = \delta_{pq} \tag{10-50}$$

These equations are analogous to (42) for orthogonal matrices.

If x and y are any two vectors,

$$(Ux)^\dagger Uy = x^\dagger(U^\dagger Uy) = x^\dagger y$$

hence a transformation by a unitary matrix leaves a bilinear or quadratic
form invariant. In particular, if $x = Uy$, then

$$x^\dagger x = (Uy)^\dagger Uy = y^\dagger y$$

This is the analogue of the fact that an orthogonal matrix in real vector 
space leaves the length of a vector unchanged. In fact, the unitary matrix 
in Hermitian vector space is the generalization of the orthogonal matrix 
for real vector space. 
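A small numerical check may make these statements concrete. The sketch below uses a hypothetical 2 x 2 unitary matrix chosen by hand and verifies that its columns are orthonormal in the Hermitian sense, that it preserves the length of an arbitrary complex vector, and that its eigenvalues, obtained here from the trace and determinant, have absolute value 1.

```python
import cmath
import math

def hdot(x, y):
    """Hermitian scalar product of two vectors."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

# Hypothetical unitary matrix.
s = 1.0 / math.sqrt(2.0)
U = [[s + 0j, 1j * s],
     [1j * s, s + 0j]]

# Hermitian scalar products of the columns: should give the Kronecker delta.
columns = [[U[i][j] for i in range(2)] for j in range(2)]
gram = [[hdot(columns[j], columns[k]) for k in range(2)] for j in range(2)]

# Length is preserved under x = U y.
y = [1 + 2j, -3 + 0.5j]
x = apply(U, y)

# Eigenvalues of a 2x2 matrix from its characteristic equation.
trace = U[0][0] + U[1][1]
det = U[0][0] * U[1][1] - U[0][1] * U[1][0]
root = cmath.sqrt(trace * trace - 4 * det)
eigenvalues = [(trace + root) / 2, (trace - root) / 2]

print(gram)
print(hdot(y, y), hdot(x, x))
print([abs(lam) for lam in eigenvalues])
```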

The product of two unitary matrices U and V is also unitary:

$$(UV)^\dagger = V^\dagger U^\dagger = V^{-1}U^{-1} = (UV)^{-1} \tag{10-51}$$

The reciprocal of a unitary matrix is unitary:

$$(U^{-1})^\dagger = (U^\dagger)^\dagger = U = (U^{-1})^{-1} \tag{10-52}$$

The eigenvalues of a unitary matrix may be real or complex but of
absolute value 1. Suppose $\lambda_i$ is an eigenvalue of U, then

$$Ux = \lambda_i x; \qquad (Ux)^\dagger Ux = x^\dagger x = \lambda_i\lambda_i^*x^\dagger x \tag{10-53}$$

Since $x^\dagger x$ is real and does not vanish, $\lambda_i\lambda_i^* = 1$.

A unitary matrix may be transformed into diagonal form by another
unitary matrix V, the diagonal elements being the eigenvalues of U. The
procedure is similar to that for similarity and orthogonal transformations.
The eigenvectors must be normalized to satisfy $V^\dagger V = E$. The result is

$$V^{-1}UV = V^\dagger UV = \Lambda = \mathrm{diag}\,(\lambda_1, \lambda_2, \cdots, \lambda_n)$$

10.21. Summary on Diagonalization of Matrices. — The matter of 
diagonalizing matrices is so useful in practice that a final and simple 
statement regarding conditions for the feasibility of this reduction seems 
in order. 

A matrix may be diagonalized (a) if all its eigenvalues are distinct (for 
procedure, see sec. 10.15a), (b) if it is Hermitian or symmetric (see secs.
10.16 and 10.19), (c) if it is unitary (see sec. 10.20). In cases (b) and (c) 
a unitary matrix can always be found to effect the transformation while in 
(a) a more general type of transforming matrix will be needed. 
QUANTUM MECHANICS

11.1. In conformity with the scope of this book, the emphasis of the 
present chapter is on the mathematics of quantum mechanics, the physical 
ideas entering the discussion only in a secondary way. Limitation of space 
further demands that only the important, and this happily implies the more 
elementary, portions of the wide field be presented. Complete exclusion 
of physical ideas would, however, leave its subject matter so poorly joined 
and so incomprehensible to the student who has no prior knowledge of 
quantum mechanics that the value of an entirely formal treatment appears 
questionable. It is also true that no part of applied mathematics exacts 
from its student a more radical change from his customary habits of 
thought, a greater tolerance for new methods of inquiry, than does this 
latest branch. In order to provide the proper attitude of mind, we preface 
the later mathematical developments by a few qualitative remarks whose 
relevance to the present book is but auxiliary. 

The central notion of classical mechanics is the mass point, or particle. 
Classical theory therefore presupposes, tacitly, that a physical system can 
in principle be recognized as a particle, or a set of particles. Until the 
advent of quantum physics this dogma has never been questioned; in 
fact scientific philosophers have frequently inflated it to the dimensions of 
a universal proposition claiming that all physical systems are composed of 
particles. The method of physical description in best accord with this 
fundamental attitude is clearly this: To correlate instantaneous positions 
of a given particle with instants of time, assuming motion to be continuous 
in space and time. Thus, if a particle moves along the X-axis, the
complete description of its motion would appear in the form $x = f(t)$.

Now it is conceivable that such a correlation becomes impossible, and 
the question then arises whether this fundamental mode of description 
should be abandoned in such circumstances. The answer which has often 
been given and which the modern physicist emphatically rejects is the flatly 
negative one, the answer alleging that classical description is intrinsically 
evident and that the relation $x = f(t)$ has meaning even when the
functional relation cannot be established. On the other hand, one would not
like to discard this successful description lightly, for instance because of
certain practical and accidental difficulties in the procedure of measuring x
as a function of t. The criterion which has ultimately produced clarity is
this: A method of description must be abandoned when it becomes impossi- 
ble, not because of experimental difficulty, but because its use contradicts 
known laws of science. Classical description has become impossible for the 
latter reason, as the following simple example will show. 

Imagine an oscillating mass point, e.g., the bob of a pendulum. As 
long as the eye can follow the bob, correlations between x and t can cer- 
tainly be made. But suppose the mass point is made to increase its fre- 
quency of vibration. The eye will soon be unable to perceive instantane- 
ous positions, but the camera can still establish them. When the camera 
fails, oscillographic methods may be available, and after that, ingenious 
devices perhaps not yet invented may serve. But ultimately, a barrier 
of an essential kind will be encountered. Let us assume that the bob
oscillates $10^{10}$ times per second. It is a fact of atomic physics that visible
light requires about $10^{-8}$ seconds to be emitted (or reflected). Thus if it
were used as the medium of report, the light-emitting mass would have to 
remain in a given position for approximately that length of time. In the 
present instance, however, the bob executes 100 vibrations within this 
period. A similar argument can finally be used to invalidate every other 
means for establishing the classical correspondence. The latter has to be 
ultimately abandoned because its use contradicts the laws of optics. 

What, then, can be done? Perhaps the example suggests an answer. 
While a snapshot can in principle no longer be taken of the rapidly oscillat- 
ing bob, a time exposure would reveal some features of its dynamical 
behavior. It would give essentially a correlation between the time the 
bob spends within a given interval dx and the location of that inter- 
val, in other words between x and the probability wdx of encountering it in 
dx. This leads to a less pretentious description of the physical system 
called a mass point, of the form $w = p(x)$, and this description is
characteristic of quantum mechanics. It is to be noted that $p(x)$ can be
inferred from the classical relation $x = f(t)$, but not $f(t)$ from $w = p(x)$.
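The passage from $x = f(t)$ to $w = p(x)$ can be imitated numerically. The sketch below assumes, purely for illustration, simple harmonic motion $x = A\cos\omega t$ (the text specifies only an oscillating mass point) and measures the fraction of a period spent in an interval, which is what a time exposure records. For this assumed motion the result can be compared with the known density $p(x) = 1/(\pi\sqrt{A^2 - x^2})$.

```python
import math

def f(t, amplitude=1.0, omega=2.0 * math.pi):
    """Assumed classical motion x = f(t): simple harmonic oscillation."""
    return amplitude * math.cos(omega * t)

def time_fraction(x_lo, x_hi, samples=200_000):
    """Fraction of one period (here of length 1) spent in x_lo <= x < x_hi."""
    hits = 0
    for k in range(samples):
        t = (k + 0.5) / samples
        if x_lo <= f(t) < x_hi:
            hits += 1
    return hits / samples

def density_integral(x_lo, x_hi, amplitude=1.0):
    """Integral of p(x) = 1 / (pi * sqrt(A^2 - x^2)) from x_lo to x_hi."""
    return (math.asin(x_hi / amplitude) - math.asin(x_lo / amplitude)) / math.pi

measured = time_fraction(0.2, 0.5)
predicted = density_integral(0.2, 0.5)
print(measured, predicted)
```

The same distribution would result from $x = A\cos(\omega t + \alpha)$ for any phase $\alpha$, so the motion cannot be recovered from $p(x)$ alone.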

Quantum mechanics provides the means for deducing probability rela- 
tions of the type described, and it does so in a logically consistent fashion. 
But before turning to this central issue, let us see what has become of the 
concept: particle. Our time exposure has left it very ill defined. Indeed 
if the system called a mass point were invisibly small or never sufficiently 
stationary to permit the classical description, the customary properties of 
particles would never be exhibited. By the criterion of essential observa- 
bility, the concept would lose its physical significance. From a misunder- 
standing of this situation there has arisen a claim that quantum mechanics 
leads to a dualism, to the monstrous conception that ultimate entities of 
physics like electrons are both particles and waves; the correct statement
is that they are neither particles nor waves, but more abstract entities for
the description of which quantum mechanics gives most simple and successful
rules. The question as to the particle or wave nature of an electron
must be put in the same class as that regarding its color, or, to use a lighter
metaphor due to the philosopher Dingle, as the question concerning the
color of an elephant's egg if an elephant laid eggs.

Despite this fundamental situation we shall place no ban upon the use 
of the terms particle, wave, etc. ; we shall even adhere to universal practice 
in calling the electron one of the elementary particles of nature ; we do this 
only, of course, as a concession to usage. But whenever a paradox arises, 
the reader should endeavor to resolve it by recalling that the “classical
language” when applied to atomic entities is in fact metaphoric.

AXIOMATIC FOUNDATION 

11.2. Definitions. — For the sake of brevity all historical considerations 
are omitted here. Nor will any attempt be made to “deduce” quantum
mechanics either from classical physics or from outstanding experimental
facts, for in a strict logical sense this cannot be done. We shall, however, 
present the framework of the theory with utmost economy of thought and 
space, committing the reader to the tacit understanding that all experi- 
mental consequences of the theory outlined have been verified as far as 
they could hitherto be tested. 

On a physical system, by which is meant any object of interest to physics
or chemistry, numerous observations or measurements can be made. The 
quantities so observed or measured, such as size, energy, position and 
momentum, are called observables. It is well to think of these observables 
without ascribing to them the intuitive qualities they possess in classical 
mechanics. Position, or energy, is not so much possessed by a system as it 
is characteristic of a certain measuring process which can be carried out upon 
it. The measurement of an observable upon a system yields a number. 

In defining the state of a physical system considerable caution must be 
exercised, for we wish to remain in keeping with the requirements outlined 
in the introductory paragraphs. First it is well to notice that by state the 
scientist never means anything not subject to arbitrary fixation; indeed 
the definition of state is made to conform to the needs of each particular 
subject. It is quite different, for instance, in classical mechanics from 
what it is in thermodynamics or in electrodynamics. Hence we need not 
feel ill at ease when in quantum mechanics a new choice is made. Leaving 
elucidation until later: a state is* a function of certain variables, a function
from which by the rules of quantum theory significant information can be
obtained. The variables may be chosen in several ways, each giving rise to
a consistent description equivalent to all others; here they will be taken to
be space coordinates, for this gives rise to the form of quantum mechanics
most commonly used, namely Schrödinger's. By state, or state function,
we thus mean a mathematical construct,
$\phi(x_1, y_1, z_1; x_2, y_2, z_2; \cdots x_n, y_n, z_n)$.

* The reader who dislikes this phrase may substitute “is represented by” for the
simple “is.” We wish to warn, however, that the spirit of quantum mechanics permits
no distinction in meaning between these two expressions.

It is possible, as we shall later see, to associate the variables
$x_1 \cdots z_n$ with the dimensions of configuration space of the classical
analogue of the system in question. In particular, the number of variables
needed in $\phi$ for a complete description of its behavior (at a given
instant of time) has always been found to be equal to the number of its
classical degrees of freedom. This must indeed be the case in order that
large scale bodies be consistently described both by quantum mechanics and
by classical mechanics. States may change with time; hence a state in its
widest meaning may be written

$$\phi(x_1, y_1, z_1, \cdots z_n, t)$$

Certain restrictions are to be placed upon state functions, restrictions
which will take on greater plausibility in view of the postulates of the next
section. Most important among them are two: first $\phi$, which may be a
complex function, must possess an integrable square* in the sense that

$$\int \phi^*\phi\,d\tau < \infty \tag{11-1}$$

where $d\tau$ is the volume element of configuration space, i.e., in
rectangular coordinates

$$d\tau = dx_1\,dy_1\,dz_1 \cdots dx_n\,dy_n\,dz_n$$

Second,

$$\phi \text{ is single-valued} \tag{11-2}$$

The function $\phi$ may of course be expressed in any other system of space
coordinates by the ordinary geometric transformations of Chapter 5.
Condition (2) is particularly important when one of the variables is an
angle, say $\alpha$, for it then requires that

$$\phi(\alpha) = \phi(\alpha + 2n\pi) \tag{11-3}$$

n being an integer.

* This statement requires modification in some cases. See remarks concerning
“continuous spectrum,” sec. 11.9c. Condition (1) must be rigorously maintained
without exception when $\int d\tau$ is finite. It seems best to present the
foundations of the theory with this restriction, leaving necessary
generalizations for later.

Finally we must include in our list of definitions another mathematical
construct, that of an operator. Every specific mathematical operation,
like adding b, or multiplying by c, or extracting the third root, etc., can be
represented by a characteristic symbol which is then called an operator.
Operators are: $b+$, $c\,\cdot$, $\sqrt[3]{\ }$, $\dfrac{d}{dx}$,
$\displaystyle\int_a^b dx$, $A\dfrac{\partial^2}{\partial x^2} + B\dfrac{\partial}{\partial x} + C$,
and so forth.
In general they act on functions. They can be applied in succession. 
When they are so applied, the order in which the operators occur is impor- 
tant. For convenience, let us use more general symbols for operators, such 
as P and Q. If P stands for $a+$ and Q for $c\,\cdot$, then $PQf$ means
$a + cf$ where $f$ is a function; however $QPf$ means $c(a + f)$. Thus

$$QPf = PQf + (c - 1)a \tag{11-4}$$

Such an equation is said to be an operator equation. The reader will at
once verify that, if P stands for $d/dx$ and Q for $x\,\cdot$, the operator
equation

$$PQf - QPf = f \tag{11-5}$$

holds. 

There is an important difference between eqs. (4) and (5); the second
is homogeneous in $f$, the first is not. From the second, $f$ may be canceled
symbolically so that it reads

$$PQ - QP = 1 \tag{11-5}$$

Only homogeneous operator equations of this kind, usually written in the
latter form without explicit insertion of the operand $f$, are of interest in
quantum mechanics.
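Both operator equations can be checked exactly on polynomials. In the sketch below a polynomial is stored as its list of coefficients, lowest power first; this representation and the helper names are choices made for the example only. The first check reproduces eq. (4), whose right-hand side contains a constant that does not depend on $f$; the second reproduces eq. (5), where the difference is $f$ itself.

```python
def differentiate(coeffs):
    """P = d/dx acting on sum(c_k x^k), stored as [c_0, c_1, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]

def times_x(coeffs):
    """Q = x. (multiplication by x)."""
    return [0] + list(coeffs)

def add_constant(a, coeffs):
    """The operator a+ (adding the constant a)."""
    return [coeffs[0] + a] + list(coeffs[1:])

def scale(c, coeffs):
    """The operator c. (multiplication by the constant c)."""
    return [c * v for v in coeffs]

def subtract(p, q):
    """p - q, with trailing zero coefficients removed."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    diff = [u - v for u, v in zip(p, q)]
    while len(diff) > 1 and diff[-1] == 0:
        diff.pop()
    return diff

f = [5, -3, 0, 2]          # f(x) = 5 - 3x + 2x^3
a, c = 4, 3

# Eq. (4): P = a+, Q = c.  ->  QPf - PQf = (c - 1) a, independent of f.
inhomogeneous = subtract(scale(c, add_constant(a, f)),
                         add_constant(a, scale(c, f)))

# Eq. (5): P = d/dx, Q = x.  ->  PQf - QPf = f.
homogeneous = subtract(differentiate(times_x(f)),
                       times_x(differentiate(f)))

print(inhomogeneous, homogeneous)
```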

The formalism of operators is convenient also in other ways. It is
possible, for instance, to define a periodic function $\phi(x)$ of period h by
writing

$$e^{hD}\phi(x) = \phi(x)$$

D being $d/dx$; for the left-hand side is, on expansion, simply the Taylor
series for $\phi(x + h)$.

Two operators, P and Q, are said to commute when $PQ - QP$ is zero.
Thus $c\,\cdot$ and $d/dx$ commute if c is a constant. Other examples of
commuting operators are: $x\,\cdot$ and $d/dy$; $d/dx$ and $\int_a^b dx$ if a
and b are constants; $a+$ and $(-b)$. Clearly, every operator commutes with
itself or any power of itself, provided that by the n-th power we mean the
n-fold iteration of the operator.

11.3. Postulates.* — a. The fundamental postulates of quantum
mechanics are three in number. The first concerns the use of observables.
Brief reflection will show that classical physics associates with observables
certain definite functions of suitable variables: x, y, z with position, $mv$
with linear momentum, $\tfrac{1}{2}mv^2$ with kinetic energy, and so forth.
These functions are chosen to describe experience most adequately. There is
no logical reason which would exclude the use of more abstract mathematical
entities in this association. It has indeed been found that, for the
description of atomic phenomena, certain operators should replace the
functions which in classical mechanics represent observables. The first
postulate may be stated as follows:

* Henceforth in the present section, and in all subsequent sections up to 11.25,
states will be supposed to be independent of the time; i.e., $\phi$ does not contain t. Such
states are known as stationary ones, and the part of quantum mechanics dealing with
them will be called quantum statics. In quantum dynamics, introduced in sec. 11.25, a
new postulate (Schrödinger's “time” equation) will be needed. This postulate is not
included in the present list. Nor do we include the Pauli principle, which is also of
axiomatic status, and which will be presented in sec. 11.33. The present limitation is
made for pedagogical reasons.

To every observable there corresponds an operator. 

The correct operator to be associated with a given observable must be 
found by trial. In the following table we give a brief summary of the four 
most important operators of quantum mechanics; the observables in 
question are understood to refer to systems classically described as groups 
of mass points having 3n degrees of freedom ($j = 1, 2, \cdots, n$), subject to no
external forces (total energy constant) and not requiring relativity treat- 
ment. The first column gives the name of the observable, the second its 
classical representation, the third its quantum mechanical representation. 


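Observable: Cartesian coordinate
    Classical:  $x_j$
    Quantum:    $x_j\,\cdot$

Observable: Cartesian component of linear momentum of j-th particle
    Classical:  $p_{x_j} = m_j\dot{x}_j$
    Quantum:    $\dfrac{\hbar}{i}\dfrac{\partial}{\partial x_j}$

Observable: X-component of angular momentum of j-th particle
    Classical:  $m_j(y_j\dot{z}_j - z_j\dot{y}_j)$
    Quantum:    $\dfrac{\hbar}{i}\left(y_j\dfrac{\partial}{\partial z_j} - z_j\dfrac{\partial}{\partial y_j}\right)$

Observable: Total energy
    Classical:  $\sum_j \dfrac{1}{2m_j}\left(p_{x_j}^2 + p_{y_j}^2 + p_{z_j}^2\right) + V(x_1 \cdots z_n)$
    Quantum:    $-\sum_j \dfrac{\hbar^2}{2m_j}\left(\dfrac{\partial^2}{\partial x_j^2} + \dfrac{\partial^2}{\partial y_j^2} + \dfrac{\partial^2}{\partial z_j^2}\right) + V(x_1 \cdots z_n)$

$m_j$ is the mass of the j-th particle; $\hbar$ is an abbreviation for Planck's
constant, h, divided by $2\pi$.

The operator form of the Cartesian coordinate $x_j$ is identical with its
classical representation and has been included only for formal reasons.
Linear momentum, a differential operator, is basic in the construction of
the last two entries in the table.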

When the operator corresponding to the linear momentum p of a single
particle is written in the vector form $-i\hbar\nabla$, those corresponding to
angular momentum and energy of this particle may be constructed according
to classical formulas: Angular momentum $= r \times p = -i\hbar\,r \times \nabla$;
and energy $= (1/2m)p^2 + V = -(\hbar^2/2m)\nabla^2 + V$. These vector forms are
valid in all other systems of coordinates and should be used as the basis for
transformations.

In view of the table, the reader will easily verify the following operator 
equations : 

Let $Q_k$ stand for the operator k-th Cartesian coordinate, $P_k$ for the
k-th component of linear momentum. Then

$$P_kQ_l - Q_lP_k = -i\hbar\,\delta_{kl} \tag{11-6}$$

Also, if $L_x$, $L_y$, and $L_z$ denote the components of the angular momentum
operator for a single particle,*

$$L_xL_y - L_yL_x = i\hbar L_z$$

$$L_yL_z - L_zL_y = i\hbar L_x \tag{11-7}$$

$$L_zL_x - L_xL_z = i\hbar L_y$$

Commutation rules, like (6) and (7), are often sufficient to define the
operators involved without recourse to their explicit form, but the latter is
usually helpful.
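The commutation rules (6) and (7) can be verified exactly by letting the operators of the table act on a polynomial in x, y, z. The sketch below stores a polynomial as a dictionary mapping exponent triples to coefficients and works in units with $\hbar = 1$; the representation, the unit convention, the test polynomial and all function names are assumptions made for the example.

```python
HBAR = 1.0  # assumed units with hbar = 1

def add(p, q, scale=1.0):
    """Return p + scale * q, dropping terms whose coefficient vanishes."""
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + scale * c
    return {key: c for key, c in out.items() if abs(c) > 1e-12}

def times(p, axis):
    """Multiply by the coordinate with index axis (0 = x, 1 = y, 2 = z)."""
    out = {}
    for key, c in p.items():
        new = list(key)
        new[axis] += 1
        out[tuple(new)] = c
    return out

def diff(p, axis):
    """Partial derivative with respect to the coordinate with index axis."""
    out = {}
    for key, c in p.items():
        if key[axis] > 0:
            new = list(key)
            new[axis] -= 1
            out[tuple(new)] = c * key[axis]
    return out

def scaled(p, factor):
    return {key: factor * c for key, c in p.items()}

def momentum(p, k):
    """P_k = -i hbar d/dq_k."""
    return scaled(diff(p, k), -1j * HBAR)

def angular(p, a, b):
    """-i hbar (q_a d/dq_b - q_b d/dq_a) acting on p."""
    return scaled(add(times(diff(p, b), a), times(diff(p, a), b), -1.0),
                  -1j * HBAR)

def Lx(p): return angular(p, 1, 2)
def Ly(p): return angular(p, 2, 0)
def Lz(p): return angular(p, 0, 1)

def commutator(op_a, op_b, p):
    return add(op_a(op_b(p)), op_b(op_a(p)), -1.0)

def same(p, q):
    return add(p, q, -1.0) == {}

# Test polynomial f = x^2 y + 3 y z + z^3 + 2 x.
f = {(2, 1, 0): 1.0, (0, 1, 1): 3.0, (0, 0, 3): 1.0, (1, 0, 0): 2.0}

# Eq. (6) for k = l = x and for k = x, l = y.
pq_same = add(momentum(times(f, 0), 0), times(momentum(f, 0), 0), -1.0)
pq_diff = add(momentum(times(f, 1), 0), times(momentum(f, 0), 1), -1.0)

print(same(pq_same, scaled(f, -1j * HBAR)), pq_diff == {})
print(same(commutator(Lx, Ly, f), scaled(Lz(f), 1j * HBAR)))
print(same(commutator(Ly, Lz, f), scaled(Lx(f), 1j * HBAR)))
print(same(commutator(Lz, Lx, f), scaled(Ly(f), 1j * HBAR)))
```

Agreement on one polynomial is of course no proof, but the same check succeeds for any polynomial because the operators are linear.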

* $L_y$ and $L_z$ may be obtained from $L_x$ in the table by cyclical permutation
of coordinates.

b. The second postulate states:

The only possible values which a measurement of the observable whose
operator is P can yield are the eigenvalues $p_\lambda$ of the equation

$$P\psi_\lambda = p_\lambda\psi_\lambda \tag{11-8}$$

provided $\psi_\lambda$ obeys conditions (1) and (2), namely:
$\int \psi_\lambda^*\psi_\lambda\,d\tau < \infty$ and $\psi_\lambda$ is
single-valued.

The range of integration depends on the particular problem under
consideration, as will be seen later.

We illustrate the meaning of this postulate by a few examples. Let us
find the measurable values of the linear momentum of a particle, known to
be somewhere on the X-axis between the finite points $x = a$ and $x = b$.
The operator P is $-i\hbar(d/dx)$. Eq. (8) therefore becomes a first-order
differential equation which can obviously be satisfied if $\psi_\lambda$ is
assumed to be a function of x only. It reads

$$-i\hbar\frac{d\psi_\lambda}{dx} = p_\lambda\psi_\lambda \tag{11-9}$$

and has the solution

$$\psi_\lambda = c\,e^{(i/\hbar)p_\lambda x}$$

Is this solution satisfactory from the point of view of eqs. (1) and (2)?
It is certainly single-valued; moreover,
$\int_a^b \psi_\lambda^*\psi_\lambda\,dx = (b - a)c^*c$ is finite
for every finite c. Hence no restriction upon $p_\lambda$ results; all values
of the linear momentum may be found upon measurement. The eigenvalues of
the linear momentum form a continuous spectrum ($\lambda$ is not a discrete
index) and every function of the form $c\,e^{(i/\hbar)px}$ with constant p is an
eigenfunction. As far as measurable values of linear momentum are
concerned, quantum mechanics leads to the same result as classical physics.

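This is not true for the angular momentum of a single particle. Here
eq. (8) reads

$$-i\hbar\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)\psi_\lambda = m_\lambda\psi_\lambda \tag{11-10}$$

provided we consider the z-component and write $m_\lambda$ for the eigenvalues.
Obviously, $\psi_\lambda$ must be a function of both x and y. But a simple
transformation of coordinates reduces the equation to a simpler form. On
putting $x = r\cos\theta$ and $y = r\sin\theta$, we have

$$\frac{\partial}{\partial\theta} = -r\sin\theta\,\frac{\partial}{\partial x} + r\cos\theta\,\frac{\partial}{\partial y} = x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}$$

Therefore eq. (10) becomes

$$-i\hbar\,\frac{\partial\psi_\lambda}{\partial\theta} = m_\lambda\psi_\lambda$$

and $\psi_\lambda$ is seen to be a function of $\theta$ alone. The solution is

$$\psi_\lambda = c\,e^{(i/\hbar)m_\lambda\theta}$$

It certainly has an integrable square, because the range of $\theta$ extends from 0
to $2\pi$, or more exactly, from $2\pi n$ to $2\pi(n + 1)$, where n is an integer. But
$\psi_\lambda$ violates the condition of single-valuedness which must be imposed in
the form (3). To satisfy it we must require that

$$\psi_\lambda(\theta) = \psi_\lambda(\theta + 2\pi)$$

and this implies $e^{(2\pi i/\hbar)m_\lambda} = 1$, that is,

$$m_\lambda = \lambda\hbar, \quad \lambda \text{ an integer} \tag{11-11}$$

Hence the only observable values of the angular momentum are given by
(11), and the eigenfunctions are $c\,e^{i\lambda\theta}$. This result is identical with the
postulate of the older Bohr theory concerning angular momentum.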
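The single-valuedness condition leading to (11) is easily tested numerically. The following Python sketch is an illustration only; the helper name, the tolerance, and the choice of units with $\hbar = 1$ are our own assumptions and not part of the text.

```python
import cmath
import math

HBAR = 1.0  # assumption: units chosen so that hbar = 1


def is_single_valued(m, tol=1e-9):
    """True if exp(i*m*theta/hbar) is unchanged when theta -> theta + 2*pi."""
    phase = cmath.exp(2j * math.pi * m / HBAR)
    return abs(phase - 1.0) < tol


# Trial values of the angular momentum; only integer multiples of hbar survive.
trial_values = (-2.0, -1.0, 0.0, 0.5, 1.0, 1.5, 2.0)
allowed = [m for m in trial_values if is_single_valued(m)]
```

Half-integral trial values are rejected because the phase factor becomes -1 rather than +1 after one full turn.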

Next we consider the possible values of the total energy of a single mass
point. The energy operator appearing in the table is often referred to as
the Hamiltonian operator and is denoted by the symbol H. Let us use $E_\lambda$
for the eigenvalues. The operator equation then becomes

$$-\frac{\hbar^2}{2m}\nabla^2\psi_\lambda + V(x,y,z)\psi_\lambda = E_\lambda\psi_\lambda \tag{11-12}$$

This equation, written perhaps more frequently in the form

$$\nabla^2\psi_\lambda + \frac{2m}{\hbar^2}(E_\lambda - V)\psi_\lambda = 0 \tag{11-12}$$

was found by Schrodinger and bears his name. Its solutions and eigenvalues
clearly depend on the functional nature of V(x,y,z); they will be
reserved for detailed consideration in secs. 9 et seq.

A rather peculiar result is obtained when (8) is applied to the coordinate
operator. The eigenvalues of "$x\cdot$" are the values $\xi_\lambda$ for which the
equation

$$x\cdot\psi_\lambda = \xi_\lambda\psi_\lambda$$

an ordinary algebraic one, possesses solutions. On writing it in the form

$$(x - \xi_\lambda)\psi_\lambda = 0$$

it is evident that either $x = \xi_\lambda$ or $\psi_\lambda = 0$. In plainer language, $\psi_\lambda$ as a
function of x vanishes everywhere except at $x = \xi_\lambda$, a constant. From a
rigorous mathematical point of view such a function is a monstrosity, but
it is useful for certain purposes to introduce it, as Dirac⁵ has done. It is
called $\delta(x - \xi_\lambda)$, the symbol being fashioned after the Kronecker $\delta$, and is
best visualized as something like $\lim_{a\to 0} c(a)\,e^{-(x-\xi_\lambda)^2/a}$. For later use the
constant c(a) will be so chosen that $\int \delta(x - \xi)\,dx = 1$, so that

$$\int f(x)\,\delta(x - \xi)\,dx = f(\xi) \tag{11-13}$$

Now it is clear that such a function can be formed for every value $\xi_\lambda$,
hence every point of the X-axis is an eigenvalue of the x-coordinate.⁶

The significance of the second postulate is best grasped when it is
regarded as furnishing a catalogue of the measurable values of all
observables for which operators are known. It implies no information concerning
the meaning of the eigenfunctions $\psi_\lambda$. These are, of course, states of the
system in the sense explained. Their nature will unfold itself when the
third postulate has been set forth. For the present we only note that
every $\psi_\lambda$ is indeterminate with respect to a constant multiplier; eq. (8)
will also be satisfied by constant $\cdot\,\psi_\lambda$. On the other hand,
$\int \psi_\lambda^*\psi_\lambda\,d\tau$ exists. We may require, therefore, that $\psi_\lambda$ is normalized
after the manner of sec. 8.2. Henceforth this will be assumed unless a
statement to the contrary is made. In this connection it may be recalled,
however, that normalization may fail intrinsically when the eigenvalues
$p_\lambda$ form a continuous spectrum. In Chapter 8 this was shown to be the case
in instances where the range of the fundamental variable became infinite.
These require special treatment.

⁵ Dirac, P. A. M., "Quantum Mechanics," Second Edition, Oxford, p. 71.

⁶ The operator $x\cdot$ has a continuous spectrum. Correspondingly, the integral
$\int \delta^2(x - \xi)\,dx$ does not exist! See sec. 11.9c.

The $\psi_\lambda$ will be orthogonal if operator and boundary conditions conform
to the circumstances of the Sturm-Liouville theory (sec. 8.5). This theory,
as will later be seen, covers most of the cases occurring in quantum
mechanics, but must be generalized somewhat to be applicable to complex
operators. At any rate, it may help the reader to anticipate the orthogonality
of all $\psi_\lambda$ associated with different eigenvalues.


c. We turn to the third postulate which states:

When a given system is in a state $\phi$, the expected mean of a sequence of
measurements on the observable whose operator is P is given by

$$\bar{p} = \int \phi^* P\phi\,d\tau \tag{11-14}$$

The expected mean is defined as in statistics: If a large number of
measurements is made on the system, and the measured values are
$p_1, p_2, \cdots, p_N$, then $\bar{p} = \frac{1}{N}\sum_{i=1}^{N} p_i$. Note that eq. (14) does not
predict the outcome of a single measurement.

In writing (14) we are again supposing that $\phi$ is normalized. This can
be brought about in all physical problems by "confining" the system in
configuration space, that is, by taking $\tau = \int d\tau$ to be finite. For infinite
configuration space $\int \phi^*\phi\,d\tau$ may still exist, in which case eq. (14) is
proper. But if it fails to exist this equation should be replaced by

$$\bar{p} = \lim_{\tau\to\infty} \frac{\int \phi^* P\phi\,d\tau}{\int \phi^*\phi\,d\tau} \tag{11-14'}$$

We illustrate the meaning of (14) by a few examples. Let a
system having one degree of freedom be in a state described by 
<i> - Then the mean value of its position will be: 


its mean momentum ; 


its mean kinetic energy 

“ S. J ** * • J <•* * * * 2 ■ ta 


e J* 


dx - 0 


It is interesting to note that, the more concentrated the function $\phi$
(the greater b) the larger will be the mean kinetic energy. To calculate
the mean total energy we should have to know the form of V(x).

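Let us take $\phi = e^{ikx}/(b - a)^{1/2}$, $a \le x \le b$. We then find

$$\bar{x} = \int_a^b \phi^* x\,\phi\,dx = \frac{b + a}{2}$$

$$\bar{p}_x = -i\hbar\int_a^b \phi^*\phi'\,dx = k\hbar$$

$$\bar{E}_{kin} = -\frac{\hbar^2}{2m}\int_a^b \phi^*\phi''\,dx = \frac{k^2\hbar^2}{2m}$$

If in this example the range is extended to infinity, let us say in such a
way that $-a = b \to \infty$, the function can clearly not be normalized.
One must then use eq. (14') in the form

$$\bar{p} = \lim_{b\to\infty} \frac{\int_{-b}^{b} \phi^* P\phi\,dx}{\int_{-b}^{b} \phi^*\phi\,dx}$$

which gives the same results as those obtained above.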
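The three mean values found for the function $e^{ikx}/(b - a)^{1/2}$ may be checked by direct numerical quadrature of eq. (14). The sketch below is illustrative only: the function name, the midpoint rule, and the units $\hbar = m = 1$ are our own assumptions, and the derivatives of the state function are inserted analytically.

```python
import cmath

HBAR = 1.0  # assumption: hbar = 1
MASS = 1.0  # assumption: m = 1


def box_plane_wave_means(k, a, b, steps=20000):
    """Midpoint-rule estimates of the mean position, momentum and kinetic
    energy for phi = exp(i k x) / sqrt(b - a) on a <= x <= b."""
    h = (b - a) / steps
    norm = (b - a) ** -0.5
    x_mean = p_mean = e_mean = 0.0 + 0.0j
    for n in range(steps):
        x = a + (n + 0.5) * h
        phi = norm * cmath.exp(1j * k * x)
        dphi = 1j * k * phi          # first derivative of phi
        d2phi = -(k ** 2) * phi      # second derivative of phi
        x_mean += phi.conjugate() * x * phi * h
        p_mean += -1j * HBAR * phi.conjugate() * dphi * h
        e_mean += -(HBAR ** 2) / (2.0 * MASS) * phi.conjugate() * d2phi * h
    return x_mean.real, p_mean.real, e_mean.real


x_bar, p_bar, e_bar = box_plane_wave_means(k=3.0, a=1.0, b=5.0)
```

With k = 3, a = 1, b = 5 the three numbers should agree with (b + a)/2, $k\hbar$, and $k^2\hbar^2/2m$ respectively.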

The three postulates here stated and exemplified do not reveal an
intuitive meaning of the state function $\phi$. It is therefore not unusual in
textbooks on quantum mechanics to add another postulate stating that
$\phi^*(x)\phi(x)$ signifies the probability that the "particle" whose state is
$\phi$ be found at the point x of configuration space (with suitable
generalization for more than one degree of freedom). This is indeed true, and it may
be well for the reader to form this basic conception; but this statement is
not a further postulate since it may be deduced from those already given.
(Cf. sec. 6.)

DEDUCTIONS FROM THE POSTULATES 

11.4. Orthogonality and Completeness of Eigenfunctions. — In Chapter 
8, orthogonality and completeness of the eigenfunctions belonging to the 
Sturm-Liouville operator L have been discussed. The proofs there given 
need to be generalized if they are to be applied to quantum mechanics, for 
the operators occurring there are not all of the same structure as L. (One 
of the most important equations encountered, the one-dimensional Schro- 
dinger equation (12), is of the Sturm-Liouville type.) They often involve 
many variables, they may be differential operators of the first order, they 
may be complex; in fact they may not be differential operators at all. 
To simplify the theory we shall assume that the eigenvalues $p_\lambda$ of eq. (8)
are discrete, and that the boundary conditions on acceptable state
functions are of the form 1 and 2. Whenever convenient we shall even assume
that $\phi$ vanishes at the boundary of configuration space, over which
integrations are to be carried out, in a manner suitable to our needs. Unless
these restrictions are made the arguments become involved and in some 
respects problematic. It would then be necessary to conduct a separate 
proof for every problem of interest; thus elegance would fall prey to rigor. 

We first define what is meant by an Hermitian operator. Let u and v
be two acceptable functions, defined over a certain range of configuration
space $\tau$. We then say that the operator P is Hermitian if

$$\int u^*\cdot Pv\,d\tau = \int v\cdot P^*u^*\,d\tau \tag{11-15}$$

All operators of interest in quantum mechanics have this property. As a
sample proof we show this for the linear momentum $P_j = -i\hbar(\partial/\partial q_j)$,
associated with the j-th Cartesian coordinate:

$$\int u^* P_j v\,d\tau = -i\hbar\int_\tau u^*\frac{\partial v}{\partial q_j}\,d\tau = -i\hbar\int_\tau u^*\frac{\partial v}{\partial q_j}\,dq_1\cdots dq_n$$

First perform the integration over $q_j$, which yields

$$-i\hbar\int_\sigma u^* v\,dq_1\cdots dq_{j-1}\,dq_{j+1}\cdots dq_n + i\hbar\int_\tau v\,\frac{\partial u^*}{\partial q_j}\,d\tau$$

The first integral, a "surface" integral taken only over n - 1 coordinates
but with u and v evaluated at the end points of the range for $q_j$,
will vanish provided u and v vanish sufficiently strongly for these extreme
values of $q_j$, which is what we are supposing. The remaining integral is
$\int v\,P_j^* u^*\,d\tau$, as eq. (15) requires.

The Hermitian property of $x\cdot$ is obvious. To prove it for the
Hamiltonian two partial integrations are necessary; the details may be left
as an exercise for the reader.

Hermitian operators have real eigenvalues. This fact follows at once
from eq. (15). The eigenvalues of P are defined by the equation

$$P\psi_\lambda = p_\lambda\psi_\lambda \tag{11-16}$$

This also implies the validity of the equation

$$P^*\psi_\lambda^* = p_\lambda^*\psi_\lambda^* \tag{11-17}$$

Now multiply (16) by $\psi_\lambda^*$, (17) by $\psi_\lambda$, and integrate over $d\tau$, obtaining

$$\int \psi_\lambda^* P\psi_\lambda\,d\tau = p_\lambda\int \psi_\lambda^*\psi_\lambda\,d\tau$$

$$\int \psi_\lambda P^*\psi_\lambda^*\,d\tau = p_\lambda^*\int \psi_\lambda\psi_\lambda^*\,d\tau$$

By (15) the left-hand sides of these two equations are equal, for $\psi_\lambda$ is
certainly an acceptable function in the sense outlined before. Hence $p_\lambda^* = p_\lambda$;
i.e., $p_\lambda$ is real. Since the eigenvalues of operators are measurable values of
observables, which must of necessity be real, the physical significance of an
operator is assured when it has the Hermitian property.

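Let us again consider eq. (16). If $\psi_\mu$ is some other eigenfunction, it is
evident that

$$\int \psi_\mu^* P\psi_\lambda\,d\tau = p_\lambda\int \psi_\mu^*\psi_\lambda\,d\tau \tag{11-18}$$

But if we start with the equation

$$P^*\psi_\mu^* = p_\mu\psi_\mu^*$$

which is true because $p_\mu$ is real, we also conclude that

$$\int \psi_\lambda P^*\psi_\mu^*\,d\tau = p_\mu\int \psi_\mu^*\psi_\lambda\,d\tau \tag{11-19}$$

Combining (18) and (19) we find

$$\int \psi_\mu^* P\psi_\lambda\,d\tau - \int \psi_\lambda P^*\psi_\mu^*\,d\tau = (p_\lambda - p_\mu)\int \psi_\mu^*\psi_\lambda\,d\tau$$

If P is Hermitian the left-hand side vanishes. Hence either $p_\lambda = p_\mu$ or
$\int \psi_\mu^*\psi_\lambda\,d\tau = 0$. We see that eigenfunctions of Hermitian operators,
belonging to different eigenvalues, are orthogonal.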
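The orthogonality just proved can be illustrated with the angular momentum eigenfunctions $e^{i\lambda\theta}$ of eq. (11). The following Python sketch is our own illustration; the normalizing factor $(2\pi)^{-1/2}$, the function name, and the midpoint rule are assumptions made for the example.

```python
import cmath
import math


def overlap(lam, mu, steps=2000):
    """Midpoint-rule estimate of the integral of psi_mu^* psi_lam over
    0 <= theta <= 2*pi, with psi_lam = exp(i*lam*theta) / sqrt(2*pi)."""
    h = 2.0 * math.pi / steps
    total = 0.0 + 0.0j
    for n in range(steps):
        theta = (n + 0.5) * h
        total += cmath.exp(1j * (lam - mu) * theta) * h
    return total / (2.0 * math.pi)


same = overlap(2, 2)       # equal eigenvalues: the normalization integral
different = overlap(1, 3)  # unequal eigenvalues: should vanish
```

The integral is unity for equal indices and vanishes, to rounding error, for unequal ones.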

The completeness of the eigenfunctions of all operators employed in
quantum mechanics is usually assumed. To the authors' knowledge, a
rigorous proof has not been given. Since, however, our main interest will
be in the Schrodinger equation which is of the Sturm-Liouville type, this
point need not detain us further. In the following we shall assume
completeness of all $\psi_\lambda$ whenever this property is needed.

Problem. Show that the angular momentum operator $L_z = -i\hbar(\partial/\partial\theta)$ is Hermitian.

11.5. Relative Frequencies of Measured Values. — Important consequences
can now be deduced from the third postulate, eq. (14). We first
note that, if P is Hermitian, every power of P is Hermitian. Moreover, if
(14) is true for every operator P, it must certainly hold for the operator
$P^r$. It implies, therefore,

$$\overline{p^r} = \int \phi^* P^r\phi\,d\tau, \quad r = 1, 2, \cdots \tag{11-20}$$

The left-hand side stands, of course, for the r-th moment of the statistical
aggregate of the measured values, i.e.,

$$\overline{p^r} = \sum_i \rho_i\,p_i^r \tag{11-21}$$

provided $\rho_i$ is the relative frequency of the occurrence of the i-th
eigenvalue $p_i$ in the set of measurements. In accordance with eq. (20), the
state function $\phi$ predicts not only the mean, but all moments of the
aggregate of measurements.⁷ Now eq. (20) may be transformed as follows.
Let the eigenfunctions of P be denoted by $\psi_\lambda$, so that $P\psi_\lambda = p_\lambda\psi_\lambda$. On
allowing P to operate on both sides of this equation, there results
$P^2\psi_\lambda = p_\lambda P\psi_\lambda = p_\lambda^2\psi_\lambda$. By continuing this process, the relation

$$P^r\psi_\lambda = p_\lambda^r\psi_\lambda \tag{11-22}$$

is established. If the function $\phi$ appearing in (20) is expanded in terms
of the $\psi_\lambda$,

$$\phi = \sum_\lambda a_\lambda\psi_\lambda$$

and this series is substituted, we find

$$\overline{p^r} = \int \sum_{ij} a_i^* a_j\,\psi_i^* P^r\psi_j\,d\tau = \sum_{ij} a_i^* a_j\,p_j^r\int \psi_i^*\psi_j\,d\tau = \sum_i a_i^* a_i\,p_i^r$$

by virtue of (22) and the orthogonality of the $\psi_\lambda$. Comparing this with
(21) it is clear that

$$\sum_i \rho_i\,p_i^r = \sum_i |a_i|^2\,p_i^r$$

for every integer r. But this can be true only if

$$\rho_i = |a_i|^2 \tag{11-23}$$

In words: when the system is in the state $\phi$, a measurement of the
observable corresponding to P will yield the value $p_i$ with a probability (relative
frequency) $|a_i|^2$, $a_i$ being the coefficient of $\psi_i$ in the expansion
$\phi = \sum_\lambda a_\lambda\psi_\lambda$, and $\psi_i$ is one of the eigenfunctions of P. The coefficients
$a_i$ are called probability amplitudes.

⁷ For terminology, see sec. 12.3.

They may be expressed in terms of $\phi$ and $\psi_i$ by the relation

$$\int \psi_i^*\phi\,d\tau = \int \psi_i^*\sum_\lambda a_\lambda\psi_\lambda\,d\tau = a_i \tag{11-24}$$

Consequently, eq. (23) may also be written

$$\rho_i = \left|\int \psi_i^*\phi\,d\tau\right|^2 \tag{11-25}$$

An interesting result is obtained when, in this equation, we let $\phi$ be
one of the eigenfunctions belonging to the operator P itself, e.g., $\psi_j$. It
then reads

$$\rho_i = \left|\int \psi_i^*\psi_j\,d\tau\right|^2 = \delta_{ij}$$

All relative frequencies are zero except the one measuring the occurrence
of the eigenvalue $p_j$, which is unity. Thus we conclude that an eigenstate
$\psi_j$ of an operator P is a state in which the system yields with certainty the
value $p_j$ when the observable corresponding to P is measured.
Eigenfunctions are simply state functions of this determinate character.
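Relations (24) and (25) lend themselves to a simple numerical illustration. In the sketch below, which is ours and not the authors', the eigenfunctions are taken to be the orthonormal functions $\sqrt{2}\sin n\pi x$ on the interval (0, 1), and the state is an arbitrarily chosen normalized mixture of the first two; all names are hypothetical.

```python
import math


def psi(n, x):
    """Orthonormal eigenfunctions sqrt(2) sin(n pi x) on 0 <= x <= 1 (assumed set)."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)


def phi(x):
    """Illustrative normalized state: (psi_1 + 2 psi_2) / sqrt(5)."""
    return (psi(1, x) + 2.0 * psi(2, x)) / math.sqrt(5.0)


def amplitude(i, steps=4000):
    """a_i = integral of psi_i^* phi dx, eq. (24), by the midpoint rule
    (the functions are real, so no conjugation is needed)."""
    h = 1.0 / steps
    return sum(psi(i, (n + 0.5) * h) * phi((n + 0.5) * h) for n in range(steps)) * h


# Relative frequencies of the first three eigenvalues, eq. (25).
frequencies = [amplitude(i) ** 2 for i in (1, 2, 3)]
```

The first two eigenvalues should appear with relative frequencies 1/5 and 4/5, and the third not at all.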

11.6. Intuitive Meaning of a State Function. — Consider now a system, 
like a simple mass point with one degree of freedom, whose state function is
$\phi(x)$. We wish to know the probability that a measurement of its position
will give the value $x = \xi$. The eigenfunction corresponding to the
operator $x\cdot$ for the value $\xi$ has been shown to be

$$\psi_\xi = \delta(x - \xi)$$

Eq. (25) now reads

$$\rho_\xi = \left|\int \delta(x - \xi)\,\phi(x)\,dx\right|^2 = |\phi(\xi)|^2 \tag{11-26}$$

by virtue of (13). The probability (density) of finding the system at $\xi$ is
given by the square of the absolute value of its state function. This fact
provides a simple intuitive meaning for the state function. It can be
generalized to several dimensions.

Let $q_1, q_2, \cdots, q_n$ be the coordinates on which $\phi$ depends. Using the
former arguments, the eigenfunction corresponding to the composite
coordinate operator $q_1\cdot q_2\cdot\,\cdots\,q_n\cdot$ may be shown to be

$$\psi = \delta(q_1 - \xi_1)\,\delta(q_2 - \xi_2)\cdots\delta(q_n - \xi_n) \tag{11-27}$$

If, therefore, we wish to find the probability $\rho(\xi_1\xi_2\cdots\xi_n)$ of finding the system
at the point $(\xi_1\xi_2\cdots\xi_n)$ of configuration space, we must use eq. (25) with
$\psi_i$ replaced by (27). Hence

$$\rho(\xi_1\xi_2\cdots\xi_n) = \left|\int \delta(q_1 - \xi_1)\cdots\delta(q_n - \xi_n)\,\phi(q_1 q_2\cdots q_n)\,dq_1\,dq_2\cdots dq_n\right|^2 = |\phi(\xi_1\xi_2\cdots\xi_n)|^2$$

11.7. Commuting Operators. — Let P and R be two operators satisfying 
the relation PR - RP = 0, and let their eigenfunctions be $\psi_\lambda$ and $\chi_\mu$,
that is

$$P\psi_\lambda = p_\lambda\psi_\lambda, \qquad R\chi_\mu = r_\mu\chi_\mu \tag{11-28}$$

We assume the state function to be $\psi_i$ so that, when P is measured, there
results with certainty the value $p_i$. But

$$RP\psi_i = PR\psi_i = p_i R\psi_i$$

Considering only the last two members of this equation, we may say that
$(R\psi_i)$ is an eigenfunction of P, namely that belonging to the eigenvalue $p_i$.
But this is possible only if $R\psi_i = \text{const.}\cdot\psi_i$. Comparison with the second
equation (28) shows the constant to be one of the $r_\mu$, and $\psi_i$ to be one of the
eigenfunctions $\chi_\mu$. We conclude that commuting operators have
simultaneous eigenstates; i.e., measurements on their observables yield definite
values for both; they do not "spread."

The fact that, when P and Q are non-commuting operators and the
state of the system is an eigenstate of P, measurements on Q will give a
statistical aggregate of values and not a single one with certainty, is usually
attributed to the interference of measuring devices. For instance, the
measurement of a particle's position disturbs its momentum, and vice
versa, so that when one is ascertained with precision, the other quantity
loses it. From this point of view, measurements on the observables
associated with commuting operators are said to be compatible; the procedures
of measurement do not conflict with each other.

11.8. Uncertainty Relation. — The proof of the famous Heisenberg
uncertainty principle which will now be given requires the use of an
inequality, similar to a well known relation due to Schwarz, though not identical
with it. (Cf. eq. 3-112.) It states: if u and v are any two "acceptable"
functions in the sense specified in connection with the definition of
Hermitian operators (sec. 11.4), then

$$\int u^* u\,d\tau\cdot\int v^* v\,d\tau \ge \frac{1}{4}\left[\int (u^* v + v^* u)\,d\tau\right]^2 \tag{11-29}$$

We assume a system to be in a state $\phi$, which need not be an eigenstate
of any particular operator, and we are interested in the results of
measurements on the observables belonging to two operators, P and Q, at present
unspecified. Introduce into eq. (29) the following functions

$$u = (P - \bar{p})\phi \quad\text{and}\quad v = i(Q - \bar{q})\phi$$

where $\bar{p}$ and $\bar{q}$ are mean values associated with P and Q through the
relation (14). Eq. (29) then reads

$$\int (P - \bar{p})^*\phi^*(P - \bar{p})\phi\,d\tau\cdot\int (Q - \bar{q})^*\phi^*(Q - \bar{q})\phi\,d\tau \ge -\frac{1}{4}\left[\int (P - \bar{p})^*\phi^*(Q - \bar{q})\phi\,d\tau - \int (Q - \bar{q})^*\phi^*(P - \bar{p})\phi\,d\tau\right]^2$$

Now P and Q are Hermitian and satisfy eq. (15); $\bar{p}$ and $\bar{q}$ are constants.
Therefore the inequality reduces to

$$\int \phi^*(P - \bar{p})^2\phi\,d\tau\cdot\int \phi^*(Q - \bar{q})^2\phi\,d\tau \ge -\frac{1}{4}\left[\int \phi^*(PQ - QP)\phi\,d\tau\right]^2 \tag{11-30}$$

Let us consider the meaning of the quantity $\int \phi^*(P - \bar{p})^2\phi\,d\tau$. When
$\phi$ is expanded in eigenfunctions of P, $\phi = \sum_\lambda a_\lambda\psi_\lambda$, and the expansion is
introduced in the integral, the result is $\sum_\lambda |a_\lambda|^2(p_\lambda - \bar{p})^2$, and this, in view
of eq. (23), is nothing other than the dispersion⁸ of the statistical aggregate
of p-measurements about their mean. For this quantity we may
introduce the more familiar symbol $\Delta p^2$. A similar identification is to be made
for $\int \phi^*(Q - \bar{q})^2\phi\,d\tau$. Inequality (30) then takes the more interesting
form

$$\Delta p^2\cdot\Delta q^2 \ge -\frac{1}{4}\left[\int \phi^*(PQ - QP)\phi\,d\tau\right]^2 \tag{11-31}$$

⁸ The "dispersion" is the square of the so-called "standard deviation." It is an
index of the "spread" of the measurements. See Chapter 12.

Now if P and Q commute, the right-hand side is zero, and it is possible for
$\Delta p^2$ or $\Delta q^2$ to be zero, or even for both to vanish. This state of affairs
recalls the result of sec. 7, which was that both p- and q-measurements
could yield single values without spread.

When P and Q do not commute, relation (31) sets a lower limit for the
product of the dispersions, often called uncertainties. Suppose, for
instance, that P is the operator $-i\hbar(d/dq)$, the linear momentum
associated with q, and Q stands for the coordinate q. We then have

$$PQ - QP = -i\hbar \tag{11-32}$$

When this is put into (31) the result is $\Delta p^2\cdot\Delta q^2 \ge \hbar^2/4$ or, written in
terms of standard deviations, $\delta p$ and $\delta q$,

$$\delta p\cdot\delta q \ge \frac{\hbar}{2} \tag{11-33}$$

This is Heisenberg's uncertainty relation.
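Relation (33) can be examined numerically for a definite state. The sketch below is an illustration under our own assumptions: it uses the normalized Gaussian $(2b/\pi)^{1/4}e^{-bx^2}$, sets $\hbar = 1$, and evaluates the dispersions by the midpoint rule with the second derivative supplied analytically. For this function both mean values vanish, so the dispersions reduce to the mean squares.

```python
import math

HBAR = 1.0  # assumption: hbar = 1


def uncertainty_product(b, half_width=12.0, steps=40000):
    """Return delta_p * delta_q for phi = (2b/pi)**0.25 * exp(-b x^2)."""
    h = 2.0 * half_width / steps
    q2 = p2 = 0.0
    for n in range(steps):
        x = -half_width + (n + 0.5) * h
        phi = (2.0 * b / math.pi) ** 0.25 * math.exp(-b * x * x)
        d2phi = (4.0 * b * b * x * x - 2.0 * b) * phi  # analytic second derivative
        q2 += phi * x * x * phi * h                    # mean of q^2 (mean q is zero)
        p2 += phi * (-(HBAR ** 2)) * d2phi * h         # mean of p^2 (mean p is zero)
    return math.sqrt(p2) * math.sqrt(q2)


product_narrow = uncertainty_product(1.0)
product_wide = uncertainty_product(0.3)
```

Whatever the value of b, the product comes out equal to $\hbar/2$: a Gaussian state attains the lower limit set by (33).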

Our result need not be cast in the form of an inequality. It is indeed
quite possible to calculate both $\delta p$ and $\delta q$ separately and exactly when the
state function $\phi$ is given, as the postulates show.

A slight generalization of the present conclusions is also possible.
There are other operators, such as $L_z$ and $\theta$ (cf. eq. 10 et seq.) which also
obey eq. (32). In fact all quantities which are called canonically
conjugate in classical physics⁹ have operators which satisfy it. (Later we
shall see that energy and time belong to this class.) For all these, the
uncertainty relation in the form (33) is valid.

Problem. Show that, if the state function $\phi$ is an eigenfunction of the operator $M_z$
corresponding to the eigenvalue $m_z$, the product of $\delta m_x$ and $\delta m_y$ is at least as great as
$(\hbar/2)m_z$.


SCHRODINGER EQUATIONS 

Attention will now be given to the eigenvalues and eigenfunctions of 
the energy operator, that is, to the solutions of the various forms of the 
Schrodinger equations, eq. (12). 

11.9. Free Mass Point. — The simplest example of a physical system
is the free mass point for which the potential energy V may be taken to be
zero. In that case eq. (12) reads

$$\nabla^2\psi + k^2\psi = 0 \tag{11-34}$$

provided we omit the subscript $\lambda$ and write $k^2 = 2mE/\hbar^2$. This quantity
has a rather simple classical significance which it is well to recognize at
once. For if E is the total energy of the particle, which is in this case
purely kinetic, then $E = \tfrac{1}{2}mv^2 = p^2/2m$. Hence $k = p/\hbar$, p being the
classical momentum of the particle. Note also that k has the dimension
of a reciprocal length.

⁹ Cf. sec. 9.2.

Eq. (34) has already been solved in Chapter 7 (cf. eq. 7-33), where it
appeared as the space form of the wave equation. To select the proper
solution, we must consider the fundamental domain, $\tau$, of our problem.
Here, a great number of possibilities present themselves.

a. Enclosure is a Parallelepiped. If the particle is known to be within
a parallelepiped of side lengths $l_1$, $l_2$, and $l_3$, then $\tau$ is this volume of space.
Moreover, since $|\psi(x,y,z)|^2$ has already been identified as the probability
of finding the particle at the point x, y, z, this quantity must certainly be
zero everywhere outside $\tau$. For reasons of continuity (which can, by more
expanded arguments, be shown to result from our axioms) we require that
$|\psi|^2$, and hence $\psi$ itself, shall vanish on the boundaries of $\tau$ also. In view
of this boundary condition, the solution of (34) in rectangular coordinates,
namely eq. 7-36, must be chosen. In more explicit form it reads

$$\psi = (A_1 e^{ik_1 x} + B_1 e^{-ik_1 x})(A_2 e^{ik_2 y} + B_2 e^{-ik_2 y})(A_3 e^{ik_3 z} + B_3 e^{-ik_3 z}), \qquad k^2 = k_1^2 + k_2^2 + k_3^2$$

The origin may be taken in one corner of the parallelepiped. Vanishing
of $\psi$ at the boundary then requires:

$$A_i + B_i = 0, \qquad A_i e^{ik_i l_i} + B_i e^{-ik_i l_i} = 0, \qquad i = 1, 2, 3$$

The first condition makes each parenthesis of $\psi$ a sine-function; the second
implies

$$k_i = \frac{n_i\pi}{l_i}$$

where $n_i$ is an integer. Hence

$$\psi = c\,\sin\frac{n_1\pi x}{l_1}\,\sin\frac{n_2\pi y}{l_2}\,\sin\frac{n_3\pi z}{l_3} \tag{11-35}$$

and

$$k^2 = \pi^2\left(\frac{n_1^2}{l_1^2} + \frac{n_2^2}{l_2^2} + \frac{n_3^2}{l_3^2}\right)$$

so that

$$E = \frac{\pi^2\hbar^2}{2m}\left(\frac{n_1^2}{l_1^2} + \frac{n_2^2}{l_2^2} + \frac{n_3^2}{l_3^2}\right) \tag{11-36}$$

If $\psi$ is to be normalized, $\int \psi^*\psi\,dx\,dy\,dz = 1$, and the constant c has the value

$$c = \left(\frac{8}{l_1 l_2 l_3}\right)^{1/2}$$

The permitted energy values form a denumerably infinite set. Their
arrangement is best represented by constructing a lattice of points filling
all space, with the reciprocal parallelepiped of sides $1/l_1$, $1/l_2$, $1/l_3$ as
crystallographic unit. If from a given point lines are drawn to all other
points, the squares of the lengths of these lines (multiplied by $\pi^2\hbar^2/2m$)
are the energies of our problem. However, not all these lines represent
different states. The function $\psi$ changes only its sign when one of the
integers $n_1$, $n_2$ or $n_3$ changes sign; it is not thereby converted into a new,
linearly independent function. Hence only the lines lying in one octant
of the lattice, with the origin of the lines at one corner, will represent
different states. If some of the l's are equal there will be degeneracy (cf.
sec. 8.6), for then an interchange of the corresponding n's will not produce
a different E, while $\psi$ will be changed into a function which is linearly
independent from the original one.
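The lattice picture and the occurrence of degeneracy can be made concrete by enumerating the first few states of eq. (36). The following Python sketch is merely illustrative; the function name is hypothetical, the side lengths are restricted to integers so that exact rational arithmetic can be used, and energies are expressed in units of $\pi^2\hbar^2/2m$.

```python
from collections import defaultdict
from fractions import Fraction


def level_table(l1, l2, l3, n_max=3):
    """Group states (n1, n2, n3), n_i >= 1, by n1^2/l1^2 + n2^2/l2^2 + n3^2/l3^2,
    i.e. by the energy of eq. (36) in units of pi^2 hbar^2 / 2m.
    Assumption: integer side lengths, so the grouping is exact."""
    table = defaultdict(list)
    for n1 in range(1, n_max + 1):
        for n2 in range(1, n_max + 1):
            for n3 in range(1, n_max + 1):
                energy = (Fraction(n1 * n1, l1 * l1)
                          + Fraction(n2 * n2, l2 * l2)
                          + Fraction(n3 * n3, l3 * l3))
                table[energy].append((n1, n2, n3))
    return dict(table)


cube = level_table(1, 1, 1)   # equal sides: interchange of the n's leaves E unchanged
brick = level_table(1, 2, 3)  # unequal sides
```

For the cube the level with energy 6 is threefold degenerate, the states being (1,1,2), (1,2,1), and (2,1,1); for the sides 1, 2, 3 no two of the states listed share an energy.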


Problem. Obtain eigenfunctions and eigenvalues of the free particle in one and two
dimensions, requiring the state functions to vanish at the end points (boundary of
parallelepiped). Show that, in the one-dimensional case, the $\psi$-function represents a
standing wave of wave length $\lambda = 2\pi\hbar/p = h/p$ (De Broglie's relation).

b. Enclosure is a Sphere. Eq. (34) must now be solved in spherical
coordinates. But this has already been done in sec. 8.4 (cf. eq. 8-25),
for an acoustical problem. The eigenfunctions are, aside from a
normalizing factor,

$$\psi = r^{-1/2}\,J_{l+1/2}(kr)\,Y_{lm}(\theta,\varphi)$$

The permitted energies are determined by the condition

$$J_{l+1/2}(ka) = 0$$

where a is the radius of the enclosure. For any integer l, there will be an
infinite set of roots of $J_{l+1/2}$ which we shall label $r_{ln}$, $n = 1, 2, \cdots, \infty$. The
permitted k's are therefore

$$k_{ln} = \frac{r_{ln}}{a}$$

and hence E, which will also depend on two indices (quantum numbers),
is given by

$$E_{ln} = \frac{\hbar^2}{2ma^2}\,(r_{ln})^2$$

c. No Enclosure. When the particle is allowed to exist anywhere in
space, the former boundary conditions need not be applied. The simplest
way to treat this case is to return to case (a) and permit $l_1$, $l_2$, and $l_3$ to
become infinite. Let us first consider the eigenvalues. The lattice of
points will condense as the l's increase, until finally it forms a continuum;
the energy states (lengths of connecting lines squared) will also move closer
and closer together until finally all (positive) energies are permitted. A
similar effect may be brought about by increasing the mass of the particle,
as a glance at eq. (36) will show. Quantum mechanics indicates no
quantization of the energy for particles which are not restricted in their motion, or
which have an infinite mass.

What happens to the $\psi$-function, (35), as the l's increase? Clearly, the
normalizing constant c tends to zero, causing $\psi$ also to vanish. The
meaning of this is quite simple: As the space in which the mass point moves
increases indefinitely, the chance of finding it at a given point, $|\psi(x,y,z)|^2$,
approaches zero. The failure of the normalization rule is therefore not
merely a mathematical phenomenon, but physically reasonable. To
circumvent it, several procedures may be employed. One is to suppose that
there is an infinite number of particles in all space, N per unit volume, and
accordingly to put $\int \psi^*\psi\,d\tau$, taken over a unit of volume, equal to N.
This leaves c finite.

When there are no boundary conditions the $\psi$-function need not be
written as a product of sines. In fact in the absence of an enclosure sine,
cosine and exponential functions are equally acceptable. Hence we may,
if we desire, write

$$\psi = c\,e^{i\mathbf{k}\cdot\mathbf{r}}$$

using the notation explained in connection with eq. (38) of Chapter 7.

Problem. Calculate eigenfunctions and eigenvalues of a free particle enclosed in 
a cylinder of radius a and length d, obtaining 

^ = c sin 'Tmioip) 

where aa is a root of Jmt ^2 

11.10. One-Dimensional Barrier Problems. — For a one-dimensional 
problem the Schrodinger equation is

$$\frac{d^2\psi}{dx^2} + \frac{2m}{\hbar^2}(E - V)\psi = 0$$

¹⁰ Another procedure is discussed, for instance, in Sommerfeld, A., "Atombau und
Spektrallinien," Vol. II.

Let us take V to be the step function given by the solid line in Fig. 1,
that is: V = 0 if x < 0, V = constant if x > 0. The solutions for
the two regions are easily written down:

$$\psi_l = A_l e^{ik_l x} + B_l e^{-ik_l x}, \quad x < 0 \text{ (left of 0)}$$

$$\psi_r = A_r e^{ik_r x} + B_r e^{-ik_r x}, \quad x > 0 \text{ (right of 0)}$$

with

$$k_l = \frac{\sqrt{2mE}}{\hbar} \quad\text{and}\quad k_r = \frac{\sqrt{2m(E - V)}}{\hbar}$$

[Fig. 11-1. Step potential: V = 0 for x < 0, V = const. for x > 0.]

But how are they to be joined? The differential equation tells us that
$\psi''$ suffers a finite discontinuity as we pass across the discontinuity in V. The
increase in $\psi'$ in crossing the origin will be

$$\lim_{\epsilon\to 0}\int_{-\epsilon}^{\epsilon} \psi''\,dx = \lim_{\epsilon\to 0}\,\epsilon\,(\psi_l'' + \psi_r'') = 0$$

Hence $\psi'$ (and a fortiori $\psi$) remains continuous at the origin. The
constants A and B must therefore be fixed by requiring

$$\psi_l(0) = \psi_r(0); \qquad \psi_l'(0) = \psi_r'(0)$$

In addition to these two we have an equation expressing normalization,
three relations in all. However, there are four constants ($A_l$, $A_r$, $B_l$, $B_r$) to
be determined. The mathematical situation is therefore such that one of
them may be chosen at will. Let us then put $B_r$ equal to zero. The
physical meaning of this will at once be clear.

On applying the continuity conditions we have

$$A_l + B_l = A_r, \qquad k_l(A_l - B_l) = k_r A_r$$

whence

$$B_l = \frac{k_l - k_r}{k_l + k_r}\,A_l \tag{11-37}$$

The coefficients A and B have a simple significance. Let us analyze
from our fundamental point of view a state function of the form
$\psi = Ae^{ikx} + Be^{-ikx}$. In view of the third postulate (eq. 14') it represents
a mean momentum

$$\bar{p} = -i\hbar\,\frac{\int \psi^*\psi'\,dx}{\int \psi^*\psi\,dx}$$

and a mean square momentum

$$\overline{p^2} = -\hbar^2\,\frac{\int \psi^*\psi''\,dx}{\int \psi^*\psi\,dx}$$

We have intentionally left the limits of integration indefinite. In
evaluating the integrals occurring here we assume that the range of integration is
very much larger than the wave length of the particles, $2\pi/k$. The
integral over the last two terms of
$\psi^*\psi = A^*A + B^*B + AB^*e^{2ikx} + A^*Be^{-2ikx}$ will then vanish, and

$$\int \psi^*\psi\,dx = (|A|^2 + |B|^2)\,l$$

l being the range of integration. By a similar procedure,

$$\int \psi^*\psi'\,dx = ik(|A|^2 - |B|^2)\,l \quad\text{and}\quad \int \psi^*\psi''\,dx = -k^2(|A|^2 + |B|^2)\,l$$

Hence

$$\bar{p} = k\hbar\,\frac{|A|^2 - |B|^2}{|A|^2 + |B|^2}, \quad\text{while}\quad \overline{p^2} = k^2\hbar^2$$

It will also be observed that $\psi$ is an eigenstate of the operator
$-\hbar^2\,d^2/dx^2$ but not of $-i\hbar\,d/dx$.

Translated into particle language, this state of affairs must be expressed
as follows. Since all particles have a root mean square momentum of
magnitude $k\hbar$, and yet the mean momentum along x is smaller than $k\hbar$,
some of them must be traveling to the right, others to the left, with
momentum $k\hbar$. If a fraction $\alpha$ travels to the right and $\beta$ to the left,

$$(\alpha - \beta)k\hbar = \bar{p}, \qquad (\alpha + \beta)k\hbar = \left(\overline{p^2}\right)^{1/2}$$

whence

$$\frac{\beta}{\alpha} = \frac{|B|^2}{|A|^2}$$

In our problem, $\beta/\alpha$ is the reflection coefficient of the barrier of potential
energy V. In view of eq. (37) it is given by

$$R = \left|\frac{k_l - k_r}{k_l + k_r}\right|^2$$

Two cases of interest may be distinguished, (a) E < V, (b) E > V. In
classical mechanics, a particle would certainly be reflected in case a,
(R = 1), certainly transmitted in case b, (R = 0). The matter is not
quite so simple in quantum mechanics. In case a, $k_l$ is real but $k_r$ is
imaginary. R is thus always 1 in agreement with the classical prediction.
But in case b both $k_l$ and $k_r$ are real, and R < 1 but not zero. Hence
every potential barrier reflects particles, even though classically one would
expect them to be only retarded.
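The two cases are readily compared by evaluating the reflection coefficient directly. The sketch below is illustrative only; the function name and the units $\hbar = m = 1$ are our own assumptions, and the complex square root is used so that an imaginary $k_r$ is handled automatically.

```python
import cmath

HBAR = 1.0  # assumption: hbar = 1
MASS = 1.0  # assumption: m = 1


def reflection_coefficient(energy, v):
    """R = |(k_l - k_r) / (k_l + k_r)|^2 for a step of height v (energy > 0)."""
    k_l = cmath.sqrt(2.0 * MASS * energy) / HBAR
    k_r = cmath.sqrt(2.0 * MASS * (energy - v)) / HBAR  # imaginary when energy < v
    return abs((k_l - k_r) / (k_l + k_r)) ** 2


r_below = reflection_coefficient(1.0, 2.0)  # case a: E < V
r_above = reflection_coefficient(2.0, 1.5)  # case b: E > V, here k_l = 2 and k_r = 1
```

In case a the result is unity; in case b, with $k_l = 2$ and $k_r = 1$, it is 1/9, small but not zero.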

Before leaving this matter, we must justify the procedure of setting $B_r$
equal to zero. This is now seen to mean omission of a beam of particles
travelling to the left in the region to the right of the origin. Had such a
beam been included, the physical condition corresponding to $\psi$ would have
implied the incidence of two beams of particles upon the origin, one from
the left and one from the right. In that case, $\beta/\alpha$ is not the reflection
coefficient of the barrier. The $\psi$-function we have chosen permits that
interpretation, for it corresponds to one beam incident from the left, one
reflected and one transmitted beam.


Problem. Prove that $\bar{p}$ is the same whether it is computed to the left or to the right
of the origin [use conditions (37)].

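A study of more complicated barriers, such as that depicted in Fig. 2,
reveals a new and striking feature: the "tunnel effect." The energy E
of the incident particles is assumed to be greater than $V_1$ and $V_3$, but
smaller than $V_2$, so that from the classical point of view every particle
would certainly be reflected. If we define

$$k_1^2 = \frac{2m}{\hbar^2}(E - V_1), \qquad \kappa^2 = \frac{2m}{\hbar^2}(V_2 - E), \qquad k_3^2 = \frac{2m}{\hbar^2}(E - V_3)$$

the $\psi$-functions for the three regions are

$$\psi_1 = A_1 e^{ik_1 x} + B_1 e^{-ik_1 x}, \quad x < 0$$

$$\psi_2 = A_2 e^{\kappa x} + B_2 e^{-\kappa x}, \quad 0 < x < a$$

$$\psi_3 = A_3 e^{ik_3 x}, \quad x > a$$

The continuity conditions for $\psi$ and $\psi'$ at both x = 0 and x = a are
seen to be:

$$A_1 + B_1 = A_2 + B_2$$

$$ik_1(A_1 - B_1) = \kappa(A_2 - B_2)$$

$$A_2 e^{\kappa a} + B_2 e^{-\kappa a} = A_3 e^{ik_3 a}$$

$$\kappa(A_2 e^{\kappa a} - B_2 e^{-\kappa a}) = ik_3 A_3 e^{ik_3 a}$$

[Fig. 11-2. Potential barrier of height $V_2$ between x = 0 and x = a, with potential $V_1$ to the left and $V_3$ to the right; the particle energy E lies between.]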

From these, $B_1$, $A_2$, and $B_2$ may be eliminated. When this is done we
obtain the relation

$$A_1 = \frac{1}{2}A_3 e^{ik_3 a}\left[\left(1 + \frac{k_3}{k_1}\right)\cosh\kappa a + i\left(\frac{\kappa}{k_1} - \frac{k_3}{\kappa}\right)\sinh\kappa a\right] \tag{11-38}$$

An argument similar to that which led us to identify the reflection
coefficient R with $|B|^2/|A|^2$, shows the transmission coefficient of the
present barrier to be

$$T = \frac{|A_3|^2\,k_3}{|A_1|^2\,k_1}$$

This may be computed from (38). In doing so we assume that $\kappa a \gg 1$
so that both $\cosh\kappa a$ and $\sinh\kappa a$ become $\tfrac{1}{2}e^{\kappa a}$. Then

$$T = \frac{16\,k_1 k_3\,e^{-2\kappa a}}{(k_1 + k_3)^2 + \left(\kappa - \dfrac{k_1 k_3}{\kappa}\right)^2}$$

As the width of the barrier increases, the factor $e^{-2\kappa a}$ (sometimes called
the "transparency factor") rapidly diminishes.
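The quality of the thick-barrier approximation can be judged by comparing it with the transmission coefficient computed from (38) without approximation. The sketch below is an illustration under our own assumptions: the function names are hypothetical, the wave numbers are supplied directly, and the phase factor $e^{ik_3 a}$ is omitted because it does not affect the absolute value.

```python
import math


def transmission_exact(k1, k3, kappa, a):
    """T = |A3|^2 k3 / (|A1|^2 k1), with A1/A3 taken from eq. (38)."""
    ratio = 0.5 * complex((1.0 + k3 / k1) * math.cosh(kappa * a),
                          (kappa / k1 - k3 / kappa) * math.sinh(kappa * a))
    return (k3 / k1) / abs(ratio) ** 2


def transmission_thick(k1, k3, kappa, a):
    """Approximate form valid when kappa * a >> 1."""
    denom = (k1 + k3) ** 2 + (kappa - k1 * k3 / kappa) ** 2
    return 16.0 * k1 * k3 * math.exp(-2.0 * kappa * a) / denom


exact = transmission_exact(1.0, 0.5, 2.0, 5.0)    # kappa * a = 10
approx = transmission_thick(1.0, 0.5, 2.0, 5.0)
```

For $\kappa a = 10$ the two expressions agree to better than one part in a million, and a barrier of zero width with $k_1 = k_3$ transmits completely, as it should.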

The surprising fact is that particles are able to "tunnel" through the
barrier although their kinetic energy is not great enough to allow them to
pass it. Classically speaking, the kinetic energy of a particle would be
negative while it is in region 2. Quantum mechanically, this statement is
devoid of meaning, since it is improper to compute E - V for this region
alone.¹¹

Fig. 3 gives a qualitative plot of the (real part of the) $\psi$-function in
the three regions here considered. It is seen that the barrier attenuates
the wave coming from the left, permitting a fraction of its amplitude to
pass out at a. The situation is quite analogous to the passage of a wave
through an absorbing layer.

* More complicated barriers are discussed by Condon, E. U., Rev. Mod. Phys. 3,
43 (1931); Eckart, C., Phys. Rev. 36, 1303 (1930).

11.11. Simple Harmonic Oscillator. — The potential energy, usually
expressed in the form $\tfrac{1}{2}kx^2$, is $\tfrac{1}{2}m\omega^2x^2$ when written in terms of the
mass $m$ and the classical frequency $\omega = 2\pi\nu$ of the oscillator. The meaning
of $\omega$ is simply that of a parameter appearing in $V$; we must no longer expect
the oscillator to go back and forth $\omega/2\pi$ times per second. The Schrödinger
equation is

$$\frac{d^2\psi}{dx^2} + (\epsilon - \beta^2x^2)\psi = 0 \qquad \text{(11-39)}$$


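if we use the abbreviations

$$\epsilon = \frac{2mE}{\hbar^2}, \qquad \beta = \frac{m\omega}{\hbar}$$

The substitution $\xi = \sqrt{\beta}\,x$ reduces (39) to the form of the differential
equation for "Hermite's orthogonal functions,"

$$\frac{d^2\psi}{d\xi^2} + \left(\frac{\epsilon}{\beta} - \xi^2\right)\psi = 0$$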

which was studied in Chapter 2 (cf. eq. 2-66). It was there found that its
solution is of the form $\psi = e^{-\xi^2/2}H(\xi)$, $H(\xi)$ being a solution of Hermite's
equation (2-62). Now $H(\xi)$ is a polynomial if the quantity $\alpha$, which
corresponds to the present $\tfrac{1}{2}(\epsilon/\beta - 1)$, is an integer. Unless this is true, $H$ is a
superposition of the infinite sequences (2-63) and (2-64). But both of
these approach infinity like $e^{\xi^2}$, as closer inspection will show. If they are
multiplied by $e^{-\xi^2/2}$ they will not yield a $\psi$-function which has an
integrable square between the limits $-\infty$ and $+\infty$, which we are here assuming
to exist. Hence $H(\xi)$ must be chosen in its polynomial form, $H_n(\xi)$.
Also, $\tfrac{1}{2}(\epsilon/\beta - 1) = n$, and this leads to

$$E_n = (n + \tfrac{1}{2})\hbar\omega = (n + \tfrac{1}{2})h\nu \qquad \text{(11-40)}$$

$$\psi_n = c\,e^{-\beta x^2/2}H_n(\sqrt{\beta}\,x) \qquad \text{(11-41)}$$
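That the functions $e^{-\xi^2/2}H_n(\xi)$ satisfy the oscillator equation with $\epsilon/\beta = 2n + 1$ can be confirmed numerically. The sketch below is an illustration, not part of the original treatment: it builds $H_n$ from the standard recurrence $H_{k+1} = 2\xi H_k - 2kH_{k-1}$ and evaluates $\psi'' + (2n + 1 - \xi^2)\psi$ by central differences; the function names and step size are assumptions.

```python
import math


def hermite(n, xi):
    """Hermite polynomial H_n(xi) via H_{k+1} = 2 xi H_k - 2 k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * xi
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * xi * h - 2.0 * k * h_prev
    return h


def psi(n, xi):
    """Unnormalized oscillator function exp(-xi^2/2) H_n(xi)."""
    return math.exp(-0.5 * xi * xi) * hermite(n, xi)


def residual(n, xi, step=1e-3):
    """Central-difference value of psi'' + (2n + 1 - xi^2) psi (should be ~0)."""
    second = (psi(n, xi + step) - 2.0 * psi(n, xi) + psi(n, xi - step)) / step**2
    return second + (2 * n + 1 - xi * xi) * psi(n, xi)


if __name__ == "__main__":
    for n in range(4):
        print(n, residual(n, 0.7))
```

The residual is small only when the constant in the bracket is the odd integer $2n + 1$, which is the numerical counterpart of the quantization condition leading to (11-40).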

If the oscillator has three degrees of freedom, the Schrödinger equation is

$$\nabla^2\psi + (\epsilon - \beta^2r^2)\psi = 0$$

when the same abbreviations as above are used. The method of separation
of variables (Chapter 7), which involves the substitution of
$X(x) \cdot Y(y) \cdot Z(z)$ for $\psi$, at once reduces this partial differential
equation to three ordinary ones

$$X'' + (\epsilon_1 - \beta^2x^2)X = 0, \qquad Y'' + (\epsilon_2 - \beta^2y^2)Y = 0, \qquad Z'' + (\epsilon_3 - \beta^2z^2)Z = 0$$

provided that $\epsilon_1 + \epsilon_2 + \epsilon_3 = \epsilon$. Each of these has a solution of the form
(41), so that

$$\psi_{n_1n_2n_3} = c\,e^{-\beta r^2/2}H_{n_1}(\sqrt{\beta}\,x)H_{n_2}(\sqrt{\beta}\,y)H_{n_3}(\sqrt{\beta}\,z) \qquad \text{(11-42)}$$

and

$$E_{n_1n_2n_3} = (n_1 + n_2 + n_3 + \tfrac{3}{2})\hbar\omega \qquad \text{(11-43)}$$
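Since the energy (11-43) depends only on the sum $N = n_1 + n_2 + n_3$, each level of the isotropic oscillator is degenerate. A short enumeration, offered here only as an illustration (the function name is an assumption), counts the triples belonging to a given $N$ and can be compared with the closed form $(N+1)(N+2)/2$.

```python
from itertools import product


def degeneracy(total):
    """Number of triples (n1, n2, n3) of non-negative integers with the given sum."""
    return sum(
        1
        for n1, n2, n3 in product(range(total + 1), repeat=3)
        if n1 + n2 + n3 == total
    )


if __name__ == "__main__":
    for total in range(5):
        # Energy in units of hbar*omega, followed by the number of states.
        print(total + 1.5, degeneracy(total))
```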

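The orthogonality of the functions (41) has been proved in eq. 3-92.
From this formula, the normalizing constant $c$ may also be computed.
For if

$$c^2\int_{-\infty}^{\infty} H_n^2(\sqrt{\beta}\,x)\,e^{-\beta x^2}\,dx = c^2\,\beta^{-1/2}\int_{-\infty}^{\infty} H_n^2(\xi)\,e^{-\xi^2}\,d\xi = 1$$

then

$$c^2\,\beta^{-1/2}\,2^n n!\,\sqrt{\pi} = 1, \qquad c = \left(\frac{\beta}{\pi}\right)^{1/4}(2^n n!)^{-1/2}$$

A similar computation, which involves three integrations, yields for the
constant $c$ of eq. (42) the value

$$c = \left(\frac{\beta}{\pi}\right)^{3/4}\left(n_1!\,n_2!\,n_3!\,2^{n_1+n_2+n_3}\right)^{-1/2}$$

Further mathematical details concerning the functions here encountered,
as well as a table of the $H_n$-polynomials, are given in sec. 3.10.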


Problem. The treatment above implied that the 3-dimensional oscillator was
isotropic; i.e., bound with equal forces in all directions. Calculate eigenvalues and
eigenfunctions for an anisotropic oscillator with potential energy

$$V = \tfrac{1}{2}m(\omega_1^2x^2 + \omega_2^2y^2 + \omega_3^2z^2)$$

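11.12. Rigid Rotator. Eigenvalues and Eigenfunctions of $L^2$. — A rigid
rotator is a pair of point masses held together by a rigid, inflexible and
inextensible (massless) bond. A diatomic molecule is a fair approximation
to a rigid rotator. Before attempting to solve the Schrödinger equation
for such a system it is well to digress briefly and consider the eigenvalue
equation for an operator, $L^2$, which so far we have not introduced,
but which is easily constructed. We have seen that the operators
corresponding to the components of angular momentum of a particle are

$$L_x = -i\hbar\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right), \qquad L_y = -i\hbar\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right), \qquad L_z = -i\hbar\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right) \qquad \text{(11-44)}$$

From these, we wish to construct the operator

$$L^2 = L_x^2 + L_y^2 + L_z^2 \qquad \text{(11-45)}$$

It is advantageous to do this in polar (spherical) coordinates. Putting
$x = r\sin\theta\cos\varphi$, $y = r\sin\theta\sin\varphi$, $z = r\cos\theta$, we have

$$\frac{\partial}{\partial x} = \sin\theta\cos\varphi\,\frac{\partial}{\partial r} + \frac{1}{r}\cos\theta\cos\varphi\,\frac{\partial}{\partial\theta} - \frac{1}{r}\frac{\sin\varphi}{\sin\theta}\frac{\partial}{\partial\varphi}$$

$$\frac{\partial}{\partial y} = \sin\theta\sin\varphi\,\frac{\partial}{\partial r} + \frac{1}{r}\cos\theta\sin\varphi\,\frac{\partial}{\partial\theta} + \frac{1}{r}\frac{\cos\varphi}{\sin\theta}\frac{\partial}{\partial\varphi}$$

$$\frac{\partial}{\partial z} = \cos\theta\,\frac{\partial}{\partial r} - \frac{1}{r}\sin\theta\,\frac{\partial}{\partial\theta}$$

When these results are introduced in (44) and (45) is formed, there results

$$L^2 = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right] \qquad \text{(11-46)}$$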


The observable values which the square of the angular momentum may
assume are the eigenvalues $p$ of the equation

$$L^2\psi = p\psi \qquad \text{(11-47)}$$

This equation is easily solved by the method of separation of variables
(cf. Chapter 7). Clearly, $\psi$ is a function of $\theta$ and $\varphi$. Put $\psi = \Theta(\theta) \cdot \Phi(\varphi)$
into (47). This equation will then break up into two ordinary equations
(the process is analogous to the construction of eqs. 7-42a and 7-42b):

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) - \frac{m^2}{\sin^2\theta}\Theta + \frac{p}{\hbar^2}\Theta = 0$$

$$\frac{d^2\Phi}{d\varphi^2} + m^2\Phi = 0$$

The quantity $m$ must be an integer to insure single-valuedness of $\Phi$.
The second equation therefore has the solution $\Phi = \text{const} \cdot e^{im\varphi}$, $m$ an
integer. The first is the equation for associated Legendre functions
(eq. 7-41), except that the constant $l(l + 1)$ appearing there is here
replaced by $p/\hbar^2$. The solution previously obtained is

$$\Theta = \sin^m\theta\,\frac{d^m}{d(\cos\theta)^m}P_l(\cos\theta)$$


Now the Legendre function $P_l$ was shown to behave singularly at
$\cos\theta = \pm 1$ unless $l$ is an integer; in fact it would contain unlimited powers
of $x$ ($= \cos\theta$). The same would be true for $\Theta$ if $l$ were arbitrary. But in
that case

$$\int \psi^*\psi\,d\tau$$

which contains the factor

$$\int_0^{\pi}\Theta^2\sin\theta\,d\theta$$

would certainly not exist. We conclude, therefore, that $l$ must be an
integer, and that the eigenvalues of $L^2$ are

$$p = l(l + 1)\hbar^2$$
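For $m = 0$ the $\Theta$-equation can be tested directly on the Legendre polynomials. The sketch below is illustrative only (function names and the step size are assumptions): it generates $P_l$ from the standard recurrence $(k+1)P_{k+1} = (2k+1)xP_k - kP_{k-1}$ and evaluates the left-hand side of the $\Theta$-equation, with $p/\hbar^2 = l(l+1)$, by central differences.

```python
import math


def legendre(l, x):
    """P_l(x) from the recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for k in range(1, l):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def angular_residual(l, theta, step=1e-4):
    """(1/sin) d/dtheta (sin dTheta/dtheta) + l(l+1) Theta, Theta = P_l(cos theta)."""

    def big_theta(t):
        return legendre(l, math.cos(t))

    def flux(t):
        # sin(t) * dTheta/dt, using a central difference of half-width step/2
        return math.sin(t) * (big_theta(t + step / 2) - big_theta(t - step / 2)) / step

    operator = (flux(theta + step / 2) - flux(theta - step / 2)) / (step * math.sin(theta))
    return operator + l * (l + 1) * big_theta(theta)


if __name__ == "__main__":
    for l in range(4):
        print(l, angular_residual(l, 1.0))
```

The residual vanishes (to the accuracy of the difference scheme) only for the eigenvalue $l(l+1)$, in line with the result $p = l(l+1)\hbar^2$.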

On the other hand, the eigenfunctions of $L^2$ are of the form*

$$\sin^m\theta\,\frac{d^m}{du^m}P_l(u)\,e^{im\varphi} = P_l^m(\cos\theta)\,e^{im\varphi}, \qquad u = \cos\theta \qquad \text{(11-48)}$$

in the notation adopted in Chapter 3 (cf. eq. 3-43). Since the eigenvalue $p$
does not depend on $m$ but only on $l$, functions like (48) with different $m$
will satisfy eq. (47). The most general solution of that equation is
therefore

$$\psi = \sum_{m=-l}^{l} c_m P_l^m(\cos\theta)\,e^{im\varphi} \qquad \text{(11-49)}$$

In Chapter 7 this function has already been encountered; it is called a
spherical harmonic and denoted by $Y_l(\theta,\varphi)$ (cf. eq. 7-43 et seq.). Hence

$$\psi = Y_l(\theta,\varphi) \qquad \text{(11-50)}$$

Since $d\tau = \sin\theta\,d\theta\,d\varphi$, normalization requires that

$$\int_0^{\pi}\sin\theta\,d\theta\int_0^{2\pi}\psi^*\psi\,d\varphi = 1$$

When (49) is inserted the integral becomes

$$2\pi\sum_{m=-l}^{l} c_m^*c_m\int_{-1}^{1}[P_l^m(x)]^2\,dx = \frac{4\pi}{2l + 1}\sum_{m=-l}^{l}|c_m|^2\,\frac{(l + |m|)!}{(l - |m|)!}$$

(cf. eq. 3-62). Hence, for normalization, the constants $c_m$ appearing in
(49) must satisfy the relation

$$\sum_{m=-l}^{l}|c_m|^2\,\frac{(l + |m|)!}{(l - |m|)!} = \frac{2l + 1}{4\pi}$$

and are otherwise arbitrary.

We are now ready to return to the problem of the rigid rotator. In
the first place, we shall assume it proper to replace it by a single mass,
rigidly tied to a center of rotation, and having the same moment of inertia
as the original system. The condition upon the state function in accord
with this assumption, aside from single-valuedness, is simply $r = a$,
a constant. The best procedure is therefore to write down the Schrödinger
equation for a particle moving in three dimensions, and then to
put $r = a$, $\partial\psi/\partial r = 0$. This requires the use of polar (spherical)
coordinates. The potential energy $V$, in this case, is clearly constant and may
be taken to be zero.

* We define here and elsewhere: $P_l^{-m} = P_l^m$, as in (3-62).

Schrödinger's equation reads (cf. Chapter 5 for transformation of the
Laplacian)†


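$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2} + \frac{2M}{\hbar^2}E\psi = 0 \qquad \text{(11-51)}$$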


When $r$ is put equal to $a$ the first term on the left vanishes, and the
remainder becomes very similar to $L^2\psi$. Indeed if we introduce, for
convenience here and later, a new operator $\Lambda^2$ defined as $-(1/\hbar^2)L^2$, eq. (51)
may be written

$$\Lambda^2\psi = -\frac{2Ma^2}{\hbar^2}E\psi \qquad \text{(11-52)}$$

But the eigenvalues of $\Lambda^2$ are obviously $-l(l + 1)$, and its eigenfunctions
are the same as those of $L^2$. The constant $-(2Ma^2/\hbar^2)E$, which
multiplies $\psi$ on the right of (52), must be identified with $-l(l + 1)$.
Hence the eigenvalues of the rotator are

$$E = \frac{\hbar^2}{2Ma^2}\,l(l + 1)$$

and its eigenfunctions

$$\psi_{l,m} = Y_l(\theta,\varphi) \qquad \text{(11-53)}$$

11.13. Motion in a Central Field. — By central field is meant a field of
force in which the potential energy is a function of $r$ only; $V$ is independent
of $\theta$ and $\varphi$. The isotropic three-dimensional oscillator treated in sec. 11
is an example of motion in a central field. Another is the motion of a
particle in a Coulomb field. It is to this last example, an electron attracted
by a positive point charge (hydrogen atom), that we shall chiefly direct our
attention. But before considering this specific case a few general features
of the central field problem will be exposed.

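It is now clear that the Laplacian, $\nabla^2$, in spherical polar coordinates
has the form

$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\Lambda^2 \qquad \text{(11-54)}$$

where $\Lambda^2$ is given by (46) divided by $-\hbar^2$. The eigenvalues of $\Lambda^2$ are
$-l(l + 1)$. The Schrödinger equation therefore reads

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2}\Lambda^2\psi + \frac{2m}{\hbar^2}[E - V(r)]\psi = 0 \qquad \text{(11-55)}$$

† To avoid confusion, we write $M$ for the electron mass in this section, returning
to the symbol $m$ in the next.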


We write $\psi$ as a product of a function $R(r)$ and another, $A(\theta,\varphi)$, which
depends only on the angles. The operator $\Lambda^2$ acts only on $A$. Eq. (55),
after multiplication by $r^2$ and subsequent division by $R \cdot A$, has the form

$$\frac{1}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2mr^2}{\hbar^2}[E - V(r)] = -\frac{\Lambda^2A}{A} \qquad \text{(11-56)}$$

The left-hand side of this equation is a function of $r$ alone, the right a
function of $\theta$ and $\varphi$. By the argument which is familiar from Chapter 7,
each side must be a constant, say $a$. Thus

$$\Lambda^2A = -aA$$

But this is simply the eigenvalue equation for $\Lambda^2$. We see, then, that
$a = l(l + 1)$, and $A = Y_l(\theta,\varphi)$.

Eq. (56) then becomes

$$\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left\{\frac{2mr^2}{\hbar^2}[E - V(r)] - l(l + 1)\right\}R = 0 \qquad \text{(11-57a)}$$

and the substitution $U(r) = rR(r)$ reduces this to

$$U'' + \left\{\frac{2m}{\hbar^2}[E - V(r)] - \frac{l(l + 1)}{r^2}\right\}U = 0 \qquad \text{(11-57b)}$$

The development so far has been totally independent of the form of $V$,
except in assuming it to be a function of $r$. The results obtained are
therefore valid for any central field. Summarizing them, we may say:

The energy states of a particle in a central field are always of the form

$$\psi = r^{-1}U_l(r)\,Y_l(\theta,\varphi)$$

and the function $U_l$ is determined by eq. (57b). It was necessary to add a
subscript $l$ to $U$ because the differential equation contains $l$ as a
parameter. The energies $E$ are obtained solely from eq. (57b).

That equation looks very much like the one-dimensional Schrödinger
equation,

$$\frac{d^2\psi}{dx^2} + \frac{2m}{\hbar^2}[E - V(x)]\psi = 0 \qquad \text{(11-58)}$$

but with the term $l(l + 1)\hbar^2/2mr^2$ added to the normal potential energy.
What is the meaning of that term? In classical mechanics, the energy of a
particle moving in three dimensions differs from that of a one-dimensional
particle by the kinetic energy of rotation, $\tfrac{1}{2}mr^2\omega^2$. This is precisely the
quantity $l(l + 1)\hbar^2/2mr^2$, for we have seen that $l(l + 1)\hbar^2$ is the certain
value of the square of the angular momentum for the state $Y_l$, in classical
language $(mr^2\omega)^2$, which when divided by $2mr^2$, gives exactly the kinetic
energy of rotation.
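The effect of the added term is easily explored numerically. The sketch below is an illustration under stated assumptions: it uses the Coulomb potential $-e^2/r$ in atomic units ($\hbar = m = e = 1$, an assumption of the sketch, not of the text), and the function names are hypothetical. For $l \geq 1$ the effective potential has a single minimum, which the code locates by a ternary search; setting the derivative to zero gives $r = l(l + 1)$ in these units.

```python
def effective_potential(r, l):
    """Coulomb potential plus rotational term, atomic units: -1/r + l(l+1)/(2 r^2)."""
    return -1.0 / r + l * (l + 1) / (2.0 * r * r)


def radius_of_minimum(l, lo=0.05, hi=200.0, iterations=200):
    """Ternary search for the minimum of the effective potential (valid for l >= 1)."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if effective_potential(m1, l) < effective_potential(m2, l):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for l in (1, 2, 3):
        r_min = radius_of_minimum(l)
        print(l, r_min, effective_potential(r_min, l))
```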

There is, however, one further difference between (57b) and (58). 
The fundamental range of r in (57b) starts at r = 0 and is limited to 
positive values, whereas the range of x in (58) may include negative values. 
This fact often has a more important effect on the eigenvalues than the 
addition of the terms just mentioned. 

Let us now solve eq. (57b), assuming a Coulomb field, i.e., $V(r) = -e^2/r$.
The energies $E$ will then be the energy levels of the hydrogen atom.* For
sufficiently large $r$ the solution is determined by

$$U'' - \left(\frac{\alpha}{2}\right)^2U = 0 \qquad \text{(11-59)}$$

provided we define

$$\left(\frac{\alpha}{2}\right)^2 = -\frac{2mE}{\hbar^2} \qquad \text{(11-60)}$$

The solution of (59) is $U_\infty = c_1e^{(\alpha/2)r} + c_2e^{-(\alpha/2)r}$, and this represents
the behavior of the correct $U$ at $\infty$. Let us first suppose that $\alpha$ is real,
which means that the energy of the particle is negative. $U$ will then
certainly not have an integrable square (note that the radial integral has the
form $\int R^2r^2\,dr = \int U^2\,dr$) if the coefficient $c_1$ fails to vanish. But we
cannot simply put it equal to zero because we have boundary conditions to
fulfill! Without going further in our analysis at the moment we expect,
therefore, that only special values of $\alpha$ will produce acceptable solutions
when $\alpha$ is real. If the total energy of the particle is negative (classically
speaking, the particle is bound to the attracting center), the energy is
expected to be quantized. The following analysis will bear this out.

If $\alpha$ is imaginary, which means that $E$ is positive, $U_\infty$ shows sinusoidal
behavior. It has, in fact, the typical form of the state function for a free
particle, and the failure of normalization occurs in the milder manner
which we have previously found associated with the presence of a continuous
spectrum of eigenvalues. There is indeed no way of choosing $c_1$ or $c_2$
or $\alpha$ which would make one $U_\infty$ more acceptable than another. We
conclude that, when $E$ is positive, the energy spectrum is continuous.

* If $e^2$ is replaced by $Ze^2$, $Z = 2$ represents ionized helium, $Z = 3$ doubly ionized
lithium, etc.

From the point of view of classical physics this result is welcome, for 
when E is positive the particle is ionized and moves through space, its 
energy being unrestricted. 

We now discuss the bound states in a more rigorous manner. Put
$E = -W$, so that $W$ is positive. Our interest will now return to eq. (57a)
which forms a more suitable basis for the present discussion. Let $r = x/\alpha$,
where $\alpha$ is defined by (60). Eq. (57a) then reads, after some cancellation,

$$x\frac{d^2R}{dx^2} + 2\frac{dR}{dx} + \left[\frac{2me^2}{\hbar^2\alpha} - \frac{x}{4} - \frac{l(l + 1)}{x}\right]R = 0 \qquad \text{(11-61)}$$

But this is precisely the differential equation for associated Laguerre
functions, which was studied in Chapter 2 (cf. eq. 71). For our immediate
purpose we shall write that equation with $n^*$ in place of $n$, since otherwise
our notation would be in conflict with physical convention. To summarize
the results of sec. 2.16:

The equation

$$xy'' + 2y' + \left[n^* - \frac{k - 1}{2} - \frac{x}{4} - \frac{k^2 - 1}{4x}\right]y = 0 \qquad \text{(11-62)}$$

has a solution possessing an integrable square* of the form

$$y = e^{-x/2}\,x^{(k-1)/2}\,L_{n^*}^k(x) \qquad \text{(11-63)}$$

provided $n^*$ and $k$ are positive integers. Moreover, $n^* - k \geq 0$ since
otherwise $L_{n^*}^k$ would vanish.

* The reader should convince himself of this fact by going back to sec. 2.16.

On comparing (61) and (62) we find, in the first place, that $(k^2 - 1)/4
= l(l + 1)$, hence

$$k = 2l + 1$$

Secondly,

$$n^* - \frac{k - 1}{2} = n^* - l = \frac{2me^2}{\hbar^2\alpha}$$

When the value of $\alpha$ is inserted here and the relation is solved for $W$,
we find

$$W = \frac{me^4}{2(n^* - l)^2\hbar^2}$$

Because of the conditions on $n^*$ and $k$, the quantity $n^* - l$ cannot be
zero. It is usually denoted by $n$ and called the total quantum number
(after the role it played in the Bohr theory). Our conclusion, then, is this:

The energy states of the hydrogen atom are

$$W_n = -E_n = \frac{1}{2}\frac{me^4}{\hbar^2n^2} \qquad \text{(11-64)}$$

and the corresponding eigenfunctions are, in accordance with (63),

$$R_{n,l} = c_{n,l}\,e^{-x/2}\,x^l\,L_{n+l}^{2l+1}(x) \qquad \text{(11-65)}$$

the variable $x$ being defined by

$$x = \alpha r = \frac{\sqrt{8mW}}{\hbar}\,r = \frac{2me^2}{n\hbar^2}\,r$$

In the Bohr theory of hydrogen, the first orbit has a radius

$$a_0 = \frac{\hbar^2}{me^2} = 0.53 \times 10^{-8}\ \text{cm}.$$

It is sometimes convenient to express $x$ in terms of it. Thus $\alpha = 2/na_0$,
and

$$x = \frac{2}{n}\frac{r}{a_0} \qquad \text{(11-66)}$$

It is to be noticed that $x$ represents a different variable for each energy
state; the quantum number $n$ determining $W$ appears as a scale factor in
the dimensionless variable $x$.
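As a numerical illustration of (11-64) and of the Bohr radius, the sketch below evaluates both in SI units. This is an assumption of the sketch: the text works in Gaussian units, so $e^2$ is replaced here by $e^2/4\pi\varepsilon_0$. The constants are CODATA 2018 values, and the function names are hypothetical.

```python
import math

# CODATA 2018 values in SI units (the text itself uses Gaussian units).
M_E = 9.1093837015e-31       # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
HBAR = 1.054571817e-34       # reduced Planck constant, J s
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m

# This combination plays the role of e^2 in the Gaussian-unit formulas.
COULOMB = E_CHARGE**2 / (4.0 * math.pi * EPS0)


def bohr_radius_cm():
    """a0 = hbar^2 / (m e^2), converted from metres to centimetres."""
    return HBAR**2 / (M_E * COULOMB) * 100.0


def energy_ev(n):
    """E_n = -(1/2) m e^4 / (hbar^2 n^2), converted from joules to electron volts."""
    return -0.5 * M_E * COULOMB**2 / (HBAR**2 * n**2) / E_CHARGE


def scaled_radius(r_cm, n):
    """Dimensionless variable x = (2/n)(r/a0) of eq. (11-66)."""
    return 2.0 * r_cm / (n * bohr_radius_cm())


if __name__ == "__main__":
    print(bohr_radius_cm())
    for n in (1, 2, 3):
        print(n, energy_ev(n))
```

The scale factor $1/n$ in `scaled_radius` makes explicit the remark that $x$ is a different variable for each energy state.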

Some integrals involving $R_{n,l}$ which occur frequently in physical and
chemical problems, have been evaluated in sec. 3.11. See also the example
at the end of sec. 3.11, which is of interest in this connection.

For later use, we write down in explicit form the state function for the
normal hydrogen atom. It is

$$R_{1,0} = c_{1,0}\,e^{-r/a_0}L_1^1 = 2a_0^{-3/2}\,e^{-r/a_0}$$

For this state $Y_l = \text{constant} = (4\pi)^{-1/2}$ when the function is
normalized. Hence the total ground state function is

$$\psi_0 = (\pi a_0^3)^{-1/2}\,e^{-r/a_0} \qquad \text{(11-67)}$$

$\psi$-functions for the higher states are listed in explicit form in Pauling and
Wilson.*

When the charge on the nucleus is not $e$ but $Ze$, $a_0$ must be replaced by
$a_0/Z$, so that

$$\psi_0 = \left(\frac{Z^3}{\pi a_0^3}\right)^{1/2}e^{-Zr/a_0} \qquad \text{(11-67a)}$$
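The normalization of the ground state function is readily checked by numerical quadrature. In the sketch below (an illustration only: the function names, the cutoff radius and the number of intervals are assumptions, and lengths are measured in units of $a_0$ by default) the integral of $4\pi r^2\psi_0^2$ is evaluated with the composite Simpson rule.

```python
import math


def psi0(r, z=1.0, a0=1.0):
    """Ground state function (Z^3 / (pi a0^3))^(1/2) exp(-Z r / a0)."""
    return math.sqrt(z**3 / (math.pi * a0**3)) * math.exp(-z * r / a0)


def norm_integral(z=1.0, a0=1.0, r_max=40.0, intervals=4000):
    """Composite Simpson rule for the integral of 4 pi r^2 psi0^2 from 0 to r_max."""
    h = r_max / intervals

    def integrand(r):
        return 4.0 * math.pi * r * r * psi0(r, z, a0) ** 2

    total = integrand(0.0) + integrand(r_max)
    for i in range(1, intervals):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


if __name__ == "__main__":
    print(norm_integral(z=1.0))
    print(norm_integral(z=2.0))
```

The cutoff `r_max` replaces the infinite upper limit; the exponential decay of the integrand makes the neglected tail negligible for the values chosen here.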

* Pauling, L., and Wilson, E. B., Jr., "Introduction to Quantum Mechanics,"
McGraw-Hill Book Co., 1935.

Problem a. Using the results of Chapter 3, show that the normalizing factor in
(65) is

$$c_{n,l} = \left\{\left(\frac{2}{na_0}\right)^3\frac{(n - l - 1)!}{2n[(n + l)!]^3}\right\}^{1/2}$$

Problem b. Work out the problem of the isotropic oscillator using spherical
coordinates, and show that the results agree with those obtained in (42) and (43).


11.14. Symmetrical Top. — In dealing with the problem of a rotating
rigid body attention must be given to the kinetic energy operator. To
obtain it we first observe that its form in rectangular coordinates, for the
$n$ particle problem (cf. sec. 11.31) is

$$T = -\frac{\hbar^2}{2}\sum_{i=1}^{n}\frac{1}{m_i}\nabla_i^2$$

The position of a rigid body is best expressed in terms of the Eulerian
angles, introduced in sec. 9.5. It was there shown that the classical
kinetic energy is given by

$$T_c = \frac{1}{2}\sum_{i=1}^{n}m_i(\dot{x}_i^2 + \dot{y}_i^2 + \dot{z}_i^2) = \frac{1}{2}A(\dot{\gamma}^2 + \dot{\alpha}^2\sin^2\gamma) + \frac{1}{2}C(\dot{\beta} + \dot{\alpha}\cos\gamma)^2$$

Let us define a line element constructed from the Cartesian coordinates

$$\xi_i = \sqrt{m_i}\,x_i, \qquad \eta_i = \sqrt{m_i}\,y_i, \qquad \zeta_i = \sqrt{m_i}\,z_i$$

as follows:

$$ds^2 = \sum_i(d\xi_i^2 + d\eta_i^2 + d\zeta_i^2) \qquad \text{(11-68)}$$

This is clearly identical with $2T_c\,dt^2$. From the form of $T_c$ in Eulerian
coordinates it is seen that $ds^2$ in these coordinates is given by

$$ds^2 = A\,d\gamma^2 + A\sin^2\gamma\,d\alpha^2 + C(d\beta + \cos\gamma\,d\alpha)^2 \qquad \text{(11-69)}$$

Now the quantum mechanical form of $T$ is the Laplacian operator
corresponding to the line element $ds^2$, multiplied by $-\hbar^2/2$. The problem is
therefore to transform the Laplacian operator from a set of coordinates in
terms of which the line element is given by (68), to a new set in terms of
which the line element is (69).

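This problem has been discussed in sec. 5.17. If

$$ds^2 = \sum_{\lambda\mu}g_{\lambda\mu}\,dq_\lambda\,dq_\mu$$

then

$$\nabla^2 = \frac{1}{\sqrt{g}}\sum_{\lambda\mu}\frac{\partial}{\partial q_\lambda}\left(\sqrt{g}\,g^{\lambda\mu}\frac{\partial}{\partial q_\mu}\right)$$

where $g$ is the determinant of the matrix $(g_{\lambda\mu})$ and $(g^{\lambda\mu})$ is its inverse.
On identifying the $g_{\lambda\mu}$ from (69) we find (putting $q_1 = \gamma$, $q_2 = \alpha$,
$q_3 = \beta$)

$$(g_{\lambda\mu}) = \begin{pmatrix} A & 0 & 0 \\ 0 & A\sin^2\gamma + C\cos^2\gamma & C\cos\gamma \\ 0 & C\cos\gamma & C \end{pmatrix}$$

and hence

$$(g^{\lambda\mu}) = \begin{pmatrix} \dfrac{1}{A} & 0 & 0 \\ 0 & \dfrac{1}{A\sin^2\gamma} & -\dfrac{\cos\gamma}{A\sin^2\gamma} \\ 0 & -\dfrac{\cos\gamma}{A\sin^2\gamma} & \dfrac{1}{C} + \dfrac{\cos^2\gamma}{A\sin^2\gamma} \end{pmatrix}, \qquad g^{1/2} = AC^{1/2}\sin\gamma$$

When these results are substituted in the expression for $\nabla^2$ we have

$$T\psi = -\frac{\hbar^2}{2}\frac{1}{\sin\gamma}\left\{\frac{\partial}{\partial\gamma}\left(\frac{\sin\gamma}{A}\frac{\partial\psi}{\partial\gamma}\right) + \frac{\partial}{\partial\alpha}\left[\frac{\sin\gamma}{A\sin^2\gamma}\frac{\partial\psi}{\partial\alpha} - \frac{\sin\gamma\cos\gamma}{A\sin^2\gamma}\frac{\partial\psi}{\partial\beta}\right] + \frac{\partial}{\partial\beta}\left[-\frac{\sin\gamma\cos\gamma}{A\sin^2\gamma}\frac{\partial\psi}{\partial\alpha} + \left(\frac{\sin\gamma}{C} + \frac{\sin\gamma\cos^2\gamma}{A\sin^2\gamma}\right)\frac{\partial\psi}{\partial\beta}\right]\right\}$$

$$= -\frac{\hbar^2}{2A}\left[\frac{\partial^2\psi}{\partial\gamma^2} + \cot\gamma\,\frac{\partial\psi}{\partial\gamma} + \frac{1}{\sin^2\gamma}\frac{\partial^2\psi}{\partial\alpha^2} + \left(\cot^2\gamma + \frac{A}{C}\right)\frac{\partial^2\psi}{\partial\beta^2} - \frac{2\cos\gamma}{\sin^2\gamma}\frac{\partial^2\psi}{\partial\alpha\,\partial\beta}\right]$$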


Since the potential energy in this problem is zero, the Schrödinger
equation becomes

$$T\psi = E\psi$$

It is separable; for if we put

$$\psi = u(\alpha) \cdot v(\beta) \cdot w(\gamma)$$

the functions $u$ and $v$ are seen to satisfy equations of the form

$$a_0\frac{d^2u}{d\alpha^2} + a_1\frac{du}{d\alpha} + a_2u = 0, \qquad b_0\frac{d^2v}{d\beta^2} + b_1\frac{dv}{d\beta} + b_2v = 0$$

where the coefficients $a_0$, $a_1$, $a_2$ are not functions of $\alpha$, and the coefficients
$b_0$, $b_1$, $b_2$ are not functions of $\beta$. Such equations have solutions

$$u = e^{im\alpha}, \qquad v = e^{ik\beta}$$

$m$ and $k$ being roots of algebraic quadratic equations involving the
coefficients $a$ and $b$. However, these need not be solved here, since the
condition of single-valuedness dictates that $m$ and $k$ be integers. We therefore
put

$$u = e^{iM\alpha}, \qquad v = e^{iK\beta}, \qquad M, K = 0, \pm 1, \pm 2, \text{etc.}$$

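The Schrödinger equation now reduces to the following ordinary
differential equation in the independent variable $\gamma$:

$$w'' + \cot\gamma\,w' - \left[\frac{M^2}{\sin^2\gamma} + \left(\cot^2\gamma + \frac{A}{C}\right)K^2 - \frac{2\cos\gamma}{\sin^2\gamma}KM - \frac{2A}{\hbar^2}E\right]w = 0$$

The substitutions

$$\tfrac{1}{2}(1 - \cos\gamma) = x, \qquad w(\gamma) = x^{|K-M|/2}(1 - x)^{|K+M|/2}F(x)$$

which are suggested when this equation is examined for its singularities
along the lines of Chapter 2, transform it to

$$(x^2 - x)\frac{d^2F}{dx^2} + [(1 + p)x - q]\frac{dF}{dx} - n(p + n)F = 0$$

the new parameters being defined as follows:

$$p = 1 + |K - M| + |K + M|$$

$$q = 1 + |K - M|$$

$$n(p + n) = \frac{2AE}{\hbar^2} - \frac{A}{C}K^2 + K^2 - \tfrac{1}{4}(p - 1)^2 - \tfrac{1}{2}(p - 1)$$

This last relation, when rearranged, may be written

$$E = \frac{\hbar^2}{2}\left[\frac{(n + \tfrac{1}{2}p - \tfrac{1}{2})(n + \tfrac{1}{2}p + \tfrac{1}{2})}{A} + \left(\frac{1}{C} - \frac{1}{A}\right)K^2\right] \qquad \text{(11-70)}$$

Reference to Chapter 2, eq. 56 will show at once that the differential
equation for $F$ is none other than the familiar hypergeometric equation
defining the Jacobi polynomials, provided $n$ is an integer. Unless this
condition is satisfied, $F$ will diverge for $x = 1$, i.e., for $\gamma = \pi$.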

Eq. (70) takes a simpler form when we introduce the new quantum
number

$$J = n + \tfrac{1}{2}(p - 1) = n + \tfrac{1}{2}|K - M| + \tfrac{1}{2}|K + M|$$

which is evidently a positive integer or zero. We then obtain

$$E = \frac{\hbar^2}{2}\left[\frac{J(J + 1)}{A} + \left(\frac{1}{C} - \frac{1}{A}\right)K^2\right]$$

an equation which determines the energy levels of the symmetrical top.
Note that the quantity $\tfrac{1}{2}|K - M| + \tfrac{1}{2}|K + M|$ is equal to the larger
of the two integers $|K|$ and $|M|$; in consequence of this neither $|K|$ nor $|M|$
can be greater than $J$.

The energy levels of the spherical top ($A = C$) are those already
obtained in sec. 11.12 (cf. eq. 11-53).
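The level scheme can be tabulated with a few lines of code. The sketch below is only an illustration: it assumes the energy expression $E = (\hbar^2/2)[J(J+1)/A + (1/C - 1/A)K^2]$ and the definition $J = n + \tfrac{1}{2}|K - M| + \tfrac{1}{2}|K + M|$; the function names and the choice $\hbar = 1$ are assumptions of the sketch.

```python
def top_energy(j, k, a, c, hbar=1.0):
    """Energy of a symmetrical top with moments of inertia A (twice) and C."""
    if abs(k) > j:
        raise ValueError("|K| cannot exceed J")
    return 0.5 * hbar**2 * (j * (j + 1) / a + (1.0 / c - 1.0 / a) * k * k)


def quantum_number_j(n, k, m):
    """J = n + |K - M|/2 + |K + M|/2; the last two terms sum to max(|K|, |M|)."""
    return n + (abs(k - m) + abs(k + m)) // 2


if __name__ == "__main__":
    # Spherical top (A = C): the K-dependence drops out.
    print([top_energy(2, k, 3.0, 3.0) for k in range(-2, 3)])
    # A top with C < A: levels with larger |K| lie higher.
    print([top_energy(2, k, 3.0, 1.0) for k in range(0, 3)])
```

For $A = C$ every level $J$ has the rotator energy $\hbar^2J(J+1)/2A$ regardless of $K$, which is the sense in which the spherical top reproduces the result for the rigid rotator.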


MATRIX MECHANICS 

11.15. General Remarks and Procedure. — The formulation of quantum
mechanics we have given in the foregoing sections was historically preceded
by Heisenberg's matrix theory. The latter, while it appears at first
glance to be an altogether different mathematical structure, strikingly
produced the same results as the former. But when the initial amazement
subsided both formulations were recognized as equivalent. In the present
text the Schrödinger-Dirac theory was discussed first because its axioms
seem perhaps less strange, and because its point of view has been more
widely adopted. The terminology of matrix mechanics, however, enjoys
great popularity and is often conducive to clarity of expression.

It is possible, and perhaps pedagogically worth while, to derive
Heisenberg's theory from the postulates of the preceding part of this chapter.
But when this is done, the impressive element of uniqueness which attaches
to matrix mechanics is completely lost. To preserve it we proceed to state the
basic facts of the theory first, to give an example of its application, and then to
exhibit its relation to the preceding developments. We can afford to be
brief, for when the equivalence of the two theories is once established, no
new insight is likely to be gained by deducing former results over again in a
different manner. As before, attention will be limited to what we have
called quantum statics. The principal facts of Chapter 10 will be used.

Heisenberg associates with every observable a square Hermitian matrix.
As in the Schrödinger theory, one of the chief concerns of matrix mechanics
is the determination of the measurable values of an observable. Let it be
desired to find the observable values of a quantity $H$, which, classically,
is a function of the Cartesian coordinates $q_i$ and momenta $p_i$,
$H = H(q_1 \cdots q_n;\ p_1 \cdots p_n)$. In our example we shall specify $H$ to be
the energy, but this restriction is not necessary. Heisenberg's directions
are these:

Find a set of matrices $Q_1, Q_2, \cdots, Q_n$; $P_1, P_2, \cdots, P_n$ which (a) satisfy
the commutation rules

$$Q_mQ_n - Q_nQ_m = 0, \qquad P_mP_n - P_nP_m = 0, \qquad P_mQ_n - Q_nP_m = -i\hbar\,\delta_{mn}E \qquad \text{(11-71)}$$

where $E$ is the unit matrix; (b) render the matrix

$$H(Q_1 \cdots Q_n;\ P_1 \cdots P_n) \quad \text{diagonal} \qquad \text{(11-72)}$$

By $H(Q_1 \cdots Q_n;\ P_1 \cdots P_n)$ is meant, of course, the matrix which is
the same function of the matrices $Q_1 \cdots P_n$ that the ordinary function $H$
is of $q_1 \cdots p_n$. The existence of the matrix $H$ and its uniqueness will be
assumed. When such a set of matrices has been found, the diagonal
elements of $H$ will be the measurable values in question. (It is also true that
the squares of the absolute values of the elements $(Q_i)_{\lambda\mu}$ are simply related
to spectroscopic transition probabilities, as will be shown later; but this
does not concern us here.) We illustrate the power of the method by an
example.

11.16. Simple Harmonic Oscillator. — The Hamiltonian function is
(cf. sec. 11)

$$H = \frac{p^2}{2m} + \tfrac{1}{2}m\omega^2q^2$$

Hence, if $P$ and $Q$ are matrices,

$$H(Q,P) = \frac{P^2}{2m} + \tfrac{1}{2}m\omega^2Q^2$$

The straightforward way of working this problem would be to select a 
set of matrices such as, e.g., 

Q\n ” — ij Px/bi “ (11—73) 

which satisfy the commutation rule (71):

$$(PQ)_{\lambda\mu} - (QP)_{\lambda\mu} = -i\hbar\,\delta_{\lambda\mu} \qquad \text{(11-71a)}$$

as the reader may verify. These must then be subjected to a similarity
transformation with some other matrix, say $S$, until the new matrices

$$Q' = S^{-1}QS, \qquad P' = S^{-1}PS$$

when substituted in $H$, make $H$ diagonal. (Cf. Chapter 10.) This
procedure, however, is usually very cumbersome and is rarely used. The
success of the matrix method depends frequently on fortunate guesses or
on specific properties of the Hamiltonian. In the present instance the
following considerations lead most directly to a solution of the problem.

Suppose that the matrices $P$ and $Q$, which satisfy (71a) and make $H$
diagonal, have already been found. Then

$$H_{kl} = \frac{1}{2m}(P^2)_{kl} + \tfrac{1}{2}m\omega^2(Q^2)_{kl} = E_k\delta_{kl} \qquad \text{(11-74)}$$

provided we write $E_k$ for the diagonal elements of $H$.

Now construct the matrix

$$(P + im\omega Q)(P - im\omega Q) = P^2 + m^2\omega^2Q^2 - im\omega(PQ - QP) = 2mH - \hbar m\omega E \qquad \text{(11-75)}$$

the last step being justified by (71a). Similarly

$$(P - im\omega Q)(P + im\omega Q) = 2mH + \hbar m\omega E \qquad \text{(11-76)}$$

On multiplying eq. (75) by the matrix

$$A = P - im\omega Q$$

from the left, and (76) by the same matrix from the right, there results

$$A(2mH - \hbar m\omega E) = (2mH + \hbar m\omega E)A$$

When this equation is written in component form it reads, after
cancellation of $2m$,

$$\sum_\lambda A_{k\lambda}\left(E_\lambda - \frac{\hbar\omega}{2}\right)\delta_{\lambda l} = \sum_\lambda\left(E_k + \frac{\hbar\omega}{2}\right)\delta_{k\lambda}A_{\lambda l}$$

or

$$A_{kl}\left(E_l - \frac{\hbar\omega}{2}\right) = \left(E_k + \frac{\hbar\omega}{2}\right)A_{kl}$$

Hence

$$(E_k - E_l + \hbar\omega)A_{kl} = 0$$

We conclude, therefore, that either

a. $A_{kl} = 0$

or

b. $E_l - E_k = \hbar\omega$

Let us first consider alternative a. If it is true, then in view of eq. (76), 


whence 




(11-77)

We may thus conclude that either

$$E_l = \tfrac{1}{2}\hbar\omega$$

or

$$E_l = E_k + \hbar\omega$$

The first relation gives us the lowest energy of the oscillator, the second
all the others. We see that the possible energies are $\tfrac{1}{2}\hbar\omega$, $\tfrac{3}{2}\hbar\omega$, etc., in
agreement with the results of sec. 11.

11.17. Equivalence of Operator and Matrix Methods. — We first establish
a theorem of great importance in quantum mechanics. Consider a
differential, Hermitian operator $L$ of the kind discussed in sec. 4, which
generates, through the eigenvalue equation

$$L\phi_i = l_i\phi_i$$

a complete set of orthonormal functions $\phi_i$. Whether $\phi_i$ is a function of
one or many coordinates is unimportant in this connection. If we introduce
other operators $M$, $N$ which act on the same variables as $L$ we can
clearly form two square arrays of numbers, i.e., matrices, by the rule:

$$M_{ij} \equiv \int\phi_i^*M\phi_j\,d\tau, \qquad N_{ij} \equiv \int\phi_i^*N\phi_j\,d\tau \qquad \text{(11-78)}$$

$d\tau$ being the element of configuration space of the variables of $\phi$. The
theorem asserts that equations which hold between the operators $M$ and $N$
also hold between the matrices formed by the rule (78). To prove this it is
necessary only to establish this parallelism for the two fundamental
operations, addition and multiplication:

$$(M + N)_{ij} = M_{ij} + N_{ij} \qquad \text{(11-79)}$$

$$(MN)_{ij} = \sum_\lambda M_{i\lambda}N_{\lambda j} \qquad \text{(11-80)}$$

The first of these is at once evident from (78). To prove the second,
let us expand the function $N\phi_j$ in terms of the $\phi_i$ themselves:

$$N\phi_j = \sum_\lambda a_{\lambda j}\phi_\lambda \qquad \text{(11-81)}$$

By the general procedure of finding the expansion coefficients,*

$$a_{ij} = \int\phi_i^*N\phi_j\,d\tau = N_{ij} \qquad \text{(11-82)}$$

The left side of (80) is, by definition, $\int\phi_i^*MN\phi_j\,d\tau$. On using (81)
and (82), this becomes

$$(MN)_{ij} = \sum_\lambda\int\phi_i^*M\phi_\lambda\,d\tau\;a_{\lambda j} = \sum_\lambda M_{i\lambda}N_{\lambda j}$$

in accord with eq. (80).

* Multiply the equation by $\phi_i^*$ and integrate, using the orthogonality of the $\phi_\lambda$.

X 

If, then, we wish to form matrices satisfying relations like (71) or (71a), 
we need only find an operator which conforms to them, select an ortho- 
normal set of functions </> and construct the matrices by means of the rule 
(78). 

Problem a. The operators Q = and P = —ie^idldx) satisfy 

$$PQ - QP = -i\hbar$$

Use the functions $\phi_k = \dfrac{1}{\sqrt{2\pi}}\,e^{ikx}$, $k = 0, \pm 1, \pm 2, \cdots$, to construct the
matrices $P_{kl}$, $Q_{kl}$. They will be found to be identical with those given in
eq. (73).

Problem b. Construct the matrices $X_{nm}$ and $P_{nm}$, using $X = x$, $P = -i\hbar(d/dx)$,
and taking as the orthonormal set the normalized Hermite orthogonal functions
discussed in Chapter 3. Note that $n$ and $m$ can only be 0 or positive.

Ans. $X_{nm} = \sqrt{(n + 1)/2\beta}\;\delta_{m,n+1} + \sqrt{n/2\beta}\;\delta_{m,n-1}$;

$P_{nm} = i\hbar\beta(n - m)X_{nm}$. [$\beta$ is defined after eq. (39).]

Show that these matrices satisfy (71a).
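The matrices of Problem b lend themselves to a direct numerical check. The sketch below is illustrative only: it truncates the infinite matrices to $N$ rows and columns (an assumption that spoils the last diagonal element of every product), sets $\hbar = m = \omega = 1$, and uses plain nested lists; all names are hypothetical. It forms the commutator of (71a) and the Hamiltonian matrix $P^2/2m + \tfrac{1}{2}m\omega^2X^2$.

```python
import math

HBAR = M = OMEGA = 1.0        # convenient units (an assumption of this sketch)
BETA = M * OMEGA / HBAR
N = 6                          # truncation size of the infinite matrices


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


# X_nm = sqrt((n+1)/(2 beta)) delta_{m,n+1} + sqrt(n/(2 beta)) delta_{m,n-1}
X = [[0j] * N for _ in range(N)]
for n in range(N - 1):
    X[n][n + 1] = X[n + 1][n] = complex(math.sqrt((n + 1) / (2 * BETA)))

# P_nm = i hbar beta (n - m) X_nm
P = [[1j * HBAR * BETA * (n - m) * X[n][m] for m in range(N)] for n in range(N)]

PX, XP = matmul(P, X), matmul(X, P)
COMMUTATOR = [[PX[i][j] - XP[i][j] for j in range(N)] for i in range(N)]

PP, XX = matmul(P, P), matmul(X, X)
H = [
    [PP[i][j] / (2 * M) + 0.5 * M * OMEGA**2 * XX[i][j] for j in range(N)]
    for i in range(N)
]

if __name__ == "__main__":
    # Last entries are affected by the truncation and are omitted.
    print([COMMUTATOR[n][n] for n in range(N - 1)])
    print([H[n][n].real for n in range(N - 1)])
```

Apart from the final row and column, the commutator is $-i\hbar$ times the unit matrix and $H$ is diagonal with elements $(n + \tfrac{1}{2})\hbar\omega$, in agreement with the energies found above.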


It is interesting to note here that a Hermitian operator, defined by
eq. (15), generates a Hermitian matrix (cf. sec. 11.10). For

$$\int\phi_i^*P\phi_j\,d\tau = \int(P^*\phi_i^*)\phi_j\,d\tau$$

simply means

$$P_{ij} = P_{ji}^*$$

in our present notation.

The success of Heisenberg's directions is now easily understood. The
differential operators which obey relations analogous to those prescribed
for Heisenberg's matrices (71) are

$$Q_m = q_m, \qquad P_m = -i\hbar\frac{\partial}{\partial q_m}$$

in other words precisely the former, Schrödinger operators.* Suppose we
select an orthonormal set of functions, $\phi_i$, belonging to the operator $L$, and
construct

$$(Q_m)_{ij} = \int\phi_i^*Q_m\phi_j\,d\tau, \quad \text{etc.}$$

When these matrices are substituted into the functional form $H$ the result
is the same as if we had at once formed

$$H_{ij} = \int\phi_i^*H\phi_j\,d\tau$$

as follows from the theorem we have proved. But the only condition
under which this matrix can be diagonal is

$$H\phi_j = \text{const.}\;\phi_j \qquad \text{(11-83)}$$

that is to say, the $\phi$-functions must be chosen to be eigenfunctions of the
Hamiltonian $H$. The problem of making the matrix $H$ diagonal is equivalent
to selecting the proper $\phi$, i.e., to solving the Schrödinger equation.
To see that the diagonal elements of $H$ are the permissible energies $E_j$ of
the former theory, we need only substitute $H\phi_j = E_j\phi_j$ into (83), obtaining

$$H_{ij} = E_i\delta_{ij}$$

* The fact that there are also others, like the ones considered in problem a, need
not disturb us here. The Schrödinger equation which results when they are used
appears different, to be sure, but reduces to its familiar form when a change of variable
is made.

It is easy to extend the Heisenberg theory beyond the limits of the
present development. The second postulate, eq. (8), is valid if $P$ is
interpreted as a matrix and $\psi$ as a vector. In the terminology of Chapter
10, the $\psi_\lambda$ are then the eigenvectors of the matrix $P$, and the $p_\lambda$ are its
eigenvalues. The relation of the eigenvectors to the state functions is not
difficult to see. Suppose we choose a basic orthonormal set of functions,
$\phi_i$. Expand the eigenfunction $\psi_i$ appearing in the operator equation

$$P\psi_i = p_i\psi_i \qquad \text{(11-84)}$$

in terms of them, viz., $\psi_i = \sum_\lambda a_{i\lambda}\phi_\lambda$.

Now multiply (84) by $\phi_j^*$ and integrate. We find immediately

$$\sum_\lambda P_{j\lambda}a_{i\lambda} = p_ia_{ij}$$

and conclude that the eigenvector $\psi_i$ has as components the coefficients
which appear in its expansion in terms of the basic $\phi$. More explicitly,

$$\psi_i = \begin{pmatrix} a_{i1} \\ a_{i2} \\ \vdots \end{pmatrix}$$

If the basic set is identical with the eigenfunctions of the operator $P$, the
eigenvector has only one non-vanishing component.

Finally, even the third postulate, (14), may be retained in the Heisenberg
theory if its form is suitably changed. We interpret $\phi$ as a vector $\boldsymbol{\phi}$
with components $a_i$, the $a_i$ being the coefficients in the expansion of the
function $\phi = \sum_\lambda a_\lambda\phi_\lambda$ in terms of our basic $\phi_i$ ($\phi$ without subscript here
denotes an arbitrary state function, not necessarily one of the set $\phi_i$), but
$\phi^*$ not as the complex conjugate, but the associate vector:

$$\boldsymbol{\phi}^\dagger = (a_1^*, a_2^*, \cdots)$$

$P$ represents the matrix $P_{ij} = \int\phi_i^*P\phi_j\,d\tau$. Eq. (14) must then be
modified to

$$\bar{p} = \boldsymbol{\phi}^\dagger P\boldsymbol{\phi}$$

which reads, when written more explicitly,

$$\bar{p} = \sum_{\lambda\mu}a_\lambda^*P_{\lambda\mu}a_\mu$$

When the $\phi_i$ are taken to be the eigenstates of the operator $P$, the
matrix $P$ becomes diagonal, and $\bar{p} = \sum_\lambda a_\lambda^*a_\lambda p_\lambda$, which is the same relation
as was found in the Schrödinger theory under these conditions.


Problem. Calculate the integral 



x^e ^^Hn(x)Hm(x)dx 


by the methods of matrix mechanics. Let <^n = 4>m = 

where Cn, Cm are normalizing factors, and note that, aside from normalizing factors, the 
integral is the matrix element {x^)nm- Now is given by eq. (3-93) ; this may be used 
in calculating 

{X )nm — Xn\X\nXfiv ' * ’ X^m 

\n,v, • • ’a 


APPROXIMATION METHODS FOR SOLVING EIGENVALUE PROBLEMS 

11.18. Variational (Ritz) Method. — In Chapter 8 we showed that the 
differential equation $L(u) + \lambda w u \equiv (pu')' - qu + \lambda w u = 0$ is the
necessary (though not sufficient!) condition upon u if it is to minimize the
integral $A(u) = \int (pu'^2 + qu^2)\,dx$. Furthermore, it was seen that A(u) could
be transformed (cf. eq. 8-37) by simple steps to $-\int u\,L(u)\,dx$. The
theory in this simple form is applicable to every one-dimensional
Schrödinger equation, for in that case the Hamiltonian operator
$H = -(\hbar^2/2m)(d^2/dx^2) + V(x)$ is of the form $-L$ if only we identify p
with $\hbar^2/2m$ and q with V. Hence we may at once say that the Schrödinger
equation is the necessary condition upon $\psi$ so that the integral

\[ \int \psi H \psi\,dx \]

shall be a minimum. The one-dimensional variation theory may also be
applied, though in a somewhat more cumbersome manner, to every ordinary
differential equation to which the multi-dimensional Schrödinger
equation gives rise on separation of variables. It is possible, however, to
prove a far more general theorem which is of utmost utility in numerous
problems of applied mathematics, a theorem of which the former statement
is a special case.

Let P be a Hermitian operator. We wish to find the normalized function
$\psi$ which will make the integral

\[ \int \psi^* P \psi\,d\tau \]

a minimum. The integration extends, as usual, over configuration space,
and we shall assume for the sake of definiteness that $\tau$ is a finite portion of
configuration space. Certainly, the necessary condition upon $\psi$ is that

\[ \delta\left[\int \psi^* P \psi\,d\tau - \lambda\int \psi^*\psi\,d\tau\right] \]

shall vanish; $\lambda$ is an undetermined (Lagrangian) multiplier (cf. sec. 6.5).
Now the variation symbol and the integral sign are commutable in this
expression because the limits of the integration are supposed finite and
fixed. Hence we have

\[ \int \delta\psi^*\cdot P\psi\,d\tau + \int \psi^*\cdot\delta(P\psi)\,d\tau - \lambda\int \delta\psi^*\cdot\psi\,d\tau - \lambda\int \psi^*\,\delta\psi\,d\tau = 0 \tag{11-85} \]

The second integral in this expression may be transformed in two steps.
First, $\delta(P\psi)$ may be replaced by $P(\delta\psi)$ since the operator P suffers no
variation. Second, because P is Hermitian and both $\psi$ and $\delta\psi$ are acceptable
functions,

\[ \int \psi^* P\,\delta\psi\,d\tau = \int \delta\psi\cdot P^*\psi^*\,d\tau. \]

Eq. (85) therefore reads

\[ \int \delta\psi^*(P\psi - \lambda\psi)\,d\tau + \int \delta\psi\,(P^*\psi^* - \lambda\psi^*)\,d\tau = 0 \tag{11-86} \]

Here $\delta\psi$ is an entirely arbitrary function. Let us take it to be real, so that
$\delta\psi^* = \delta\psi$. Eq. (86) can then be satisfied only if

\[ P\psi - \lambda\psi + P^*\psi^* - \lambda\psi^* = 0 \]

On the other hand, if we take $\delta\psi$ to be imaginary, so that $\delta\psi^* = -\delta\psi$,
we conclude

\[ P\psi - \lambda\psi - P^*\psi^* + \lambda\psi^* = 0 \]

Addition of the last two equations yields

\[ P\psi = \lambda\psi; \]

subtraction gives

\[ P^*\psi^* = \lambda\psi^* \]

We have shown that, if

\[ \delta\int \psi^* P \psi\,d\tau = 0 \tag{11-87} \]

for normalized $\psi$, this function must satisfy the eigenvalue equation

\[ P\psi = \lambda\psi \tag{11-88} \]

which also automatically determines $\lambda$. Whether, when (88) is satisfied,
the minimum of, or indeed the integral, $\int \psi^* P \psi\,d\tau$, actually exists, is a
point we have not investigated. It is customary in physics not to worry
about these eventualities, for they are difficult to discuss. The mathematical
equivalence of the minimal property of the integral and eq. (88) is
usually taken as a matter of faith.


If $\psi$ satisfies eq. (88), then $\int \psi^* P \psi\,d\tau = \lambda$. From what has been said
it follows, therefore, that the integral $\int \phi^* P \phi\,d\tau$ computed with a function
$\phi$ different from the minimizing $\psi$ cannot be smaller than $\lambda$. But here a
slight complication arises, for there are many eigenvalues $\lambda_i$. All that we
can really say is that for a function $\phi$ in the neighborhood of $\psi_i$, the
integral will not be greater than $\lambda_i$. Certainly, however,

\[ \int \phi^* P \phi\,d\tau \ge \lambda_0 \tag{11-89} \]

if $\phi$ is any analytic and continuous function and $\lambda_0$ the lowest eigenvalue.
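
The inequality (11-89) can be illustrated numerically in a finite-dimensional setting. The sketch below is only an illustration under stated assumptions: the operator P is replaced by a hypothetical real symmetric 2 x 2 matrix, the trial function by a two-component vector, and the integral by the normalized quadratic form. The matrix entries and function names are invented for the example and do not come from the text.

```python
import math
import random

def lowest_eigenvalue_2x2(a, b, d):
    """Lowest eigenvalue of the real symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    half_gap = math.hypot(0.5 * (a - d), b)
    return mean - half_gap

def rayleigh_quotient(a, b, d, x, y):
    """(v, P v) / (v, v) for v = (x, y); dividing by (v, v) normalizes v."""
    numerator = a * x * x + 2.0 * b * x * y + d * y * y
    return numerator / (x * x + y * y)

def smallest_quotient_over_trials(a, b, d, trials=1000, seed=0):
    """Smallest quotient found among randomly chosen trial vectors."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(trials):
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x == 0.0 and y == 0.0:
            continue  # the zero vector cannot be normalized
        best = min(best, rayleigh_quotient(a, b, d, x, y))
    return best

# Hypothetical matrix standing in for the Hermitian operator P
lam0 = lowest_eigenvalue_2x2(2.0, 1.0, 3.0)
best = smallest_quotient_over_trials(2.0, 1.0, 3.0)
print("lowest eigenvalue:", lam0)
print("smallest trial value:", best)
```

No trial vector gives a value below the lowest eigenvalue, and the more trial vectors are sampled, the closer the smallest value comes to it.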

The Ritz method, named after its inventor, is a systematic procedure, 
based upon the foregoing variational considerations, for solving the eigen- 
value equation (88) by substituting into the integral in (87) a suitable 
sequence of functions which causes the integral to converge upon the 
value X. Instead of presenting the method in its original form, we shall 

here work out some of its features in a manner more directly adapted to
the needs of quantum mechanics, and with a slight loss of rigor. We are
usually interested in finding the energies, particularly the lowest (normal
state) energy of physical or chemical systems, hence we identify at once
the operator P in (89) with the Hamiltonian H.

Restriction to functions with a certain number of derivatives is necessary because
P is in general a differential operator, and $P\phi$ must have meaning.

Ritz, W., J. f. reine und angew. Math. 135, 1 (1909); Courant-Hilbert, p. 150.

The simplest way of finding an approximation to the lowest energy of a
system is to use (89) directly. Sometimes a good guess can be made as to
the general form of the true state function $\psi$, a form which may allow the
inclusion of one or more arbitrary parameters. The integral in (89) is
then computed with this function, and the result is minimized with respect
to the parameters. An example will clarify the method.

11.19. Example: Normal State of the Helium Atom. — The helium 
atom consists of two electrons moving in the field of a nucleus of charge 2e 
and at the same time repelling each other. We consider the nucleus as 
stationary and denote the distances of the two electrons from it by $r_1$
and $r_2$ respectively; $r_{12}$ is the interelectronic distance. The potential
energy is $-2e^2(1/r_1 + 1/r_2) + e^2/r_{12}$, and the Schrödinger equation is

\[ H\psi \equiv \left[-\frac{\hbar^2}{2m}(\nabla_1^2 + \nabla_2^2) - 2e^2\left(\frac{1}{r_1} + \frac{1}{r_2}\right) + \frac{e^2}{r_{12}}\right]\psi = E\psi \tag{11-90} \]

A subscript on the symbol $\nabla$ indicates that the Laplacian is to be taken
with respect to the coordinates labeled by the subscript. If the term
$e^2/r_{12}$ were absent eq. (90) would be separable, for then the operator H
would be the sum of two helium-ion Hamiltonians, $H = H_1 + H_2$, the
first acting on the coordinates of electron 1, the second on those of
electron 2. But the equation

\[ (H_1 + H_2)\psi = E\psi \]

may be separated on substitution of $\psi = u(1)v(2)$, where u(1) stands for a
function of the space coordinates of electron 1, and v(2) is defined similarly.
For it becomes, after division by $\psi$,

\[ \frac{H_1u(1)}{u(1)} + \frac{H_2v(2)}{v(2)} = E = \text{a constant} \]

which indicates that $H_1u(1) = E_1u(1)$; $H_2v(2) = E_2v(2)$; $E_1 + E_2 = E$.
But the first two of these are simply Schrödinger equations for the singly
charged helium ion, whose solutions we already know. (Cf. eq. 67a.) Since
we wish to find the lowest energy of our system, we identify the functions as
follows:

\[ u(1) = \left(\frac{Z^3}{\pi a_0^3}\right)^{1/2} e^{-Zr_1/a_0}, \qquad v(2) = \left(\frac{Z^3}{\pi a_0^3}\right)^{1/2} e^{-Zr_2/a_0} \]

and $\psi$ is the product of these.

The correct solution of eq. (90) is certainly not of this exact form
because of the "interaction term" $e^2/r_{12}$, whose effect on $\psi$ one would
expect to be very complicated indeed. Aside from other changes, it will
cause $\psi$ to depend on $r_{12}$ explicitly. But from a physical point of view, the
repulsion between the electrons will cause both of them to be, on the average,
farther away from the nucleus than if the repulsion were absent. This
would mean that the functions u and v are in error with respect to the scale
factor $Z/a_0$. If this were smaller, a more extended probability distribution
would result. (For the helium ion Z = 2.) It would seem expedient,
therefore, that we take as our trial function in the variational procedure
the function $\phi = u(1)v(2)$ but with an undetermined Z.

In calculating

\[ \int \phi H \phi\,d\tau \tag{11-91} \]

it is well to have available the differential equation whose solutions are u
and v:

\[ -\frac{\hbar^2}{2m}\nabla^2 u(1) = \frac{Ze^2}{r_1}u(1) + Z^2E_H\,u(1), \qquad v(2) = u(2) \tag{11-92} \]

Here $E_H$ is the energy of the normal hydrogen atom, $E_H = -e^2/2a_0$
(= −13.53 electron volts). The differential $d\tau$ in (91) represents, of course, the
product of the volume elements for the two electrons. When H is taken
from (90), we find, using (92) and the fact that u is normalized,

\[ \int \phi H \phi\,d\tau = 2Z^2E_H + (Z - 2)e^2\left[\int \frac{\phi^2}{r_1}\,d\tau + \int \frac{\phi^2}{r_2}\,d\tau\right] + e^2\int \frac{\phi^2}{r_{12}}\,d\tau \tag{11-93} \]

The integral

\[ \int \frac{\phi^2}{r_1}\,d\tau = \int \frac{u^2(1)}{r_1}\,d\tau_1 \cdot \int u^2(2)\,d\tau_2 = \frac{Z^3}{\pi a_0^3}\iiint \frac{e^{-2Zr/a_0}}{r}\,r^2\,dr\,\sin\theta\,d\theta\,d\varphi \]

is easily computed directly. It has, in fact, already been evaluated earlier;
its value is $Z/a_0$, and $\int (\phi^2/r_2)\,d\tau$ has the same value. We leave the
evaluation of $e^2\int (\phi^2/r_{12})\,d\tau$ for later: its value is $-\tfrac{5}{4}ZE_H$. Hence,
eq. (93) becomes

\[ \int \phi H \phi\,d\tau = 2Z^2E_H + (Z - 2)\,\frac{e^2}{a_0}\cdot 2Z - \frac{5}{4}ZE_H = Z\left[2Z - 4(Z - 2) - \tfrac{5}{4}\right]E_H \]

This expression is to be made as small as possible by choosing Z properly,
i.e., the coefficient must take its maximum value because $E_H < 0$. Putting
the derivative with respect to Z equal to zero, we find for the minimizing Z
the value 27/16, which is somewhat less than 2 as we expected. Hence
the best energy value attainable by adjusting Z in our function is
$Z(27/4 - 2Z)E_H = 5.695E_H$. The energy found experimentally is
$5.807E_H$. The difference between these two values is to be ascribed to the
defects of the simple trial function here chosen.
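
The minimization over Z can be checked with exact rational arithmetic. The sketch below assumes only the coefficient $Z(27/4 - 2Z)$ obtained above; the function and variable names are illustrative and not part of the text.

```python
from fractions import Fraction

def energy_coefficient(z):
    """Coefficient c(Z) in E(Z) = c(Z) * E_H, with c(Z) = Z * (27/4 - 2 Z)."""
    return z * (Fraction(27, 4) - 2 * z)

# dc/dZ = 27/4 - 4 Z vanishes at Z = 27/16
z_best = Fraction(27, 4) / 4
c_best = energy_coefficient(z_best)

# Since E_H < 0, the largest coefficient gives the lowest energy
step = Fraction(1, 100)
neighbours = [energy_coefficient(z_best - step), energy_coefficient(z_best + step)]

print("Z =", z_best, " coefficient =", c_best, "=", float(c_best))
print("coefficient with the unscreened charge Z = 2:", energy_coefficient(Fraction(2)))
```

With the unscreened value Z = 2 the coefficient is 11/2, so allowing Z to vary lowers the energy estimate from $5.5E_H$ to about $5.695E_H$.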





A very interesting summary of the results of the present method as 
applied to helium is given by Pauling and Wilson. Their table shows how 
the value of the integral approaches the experimental energy as increasingly 
refined trial functions are used. 

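To complete the analysis we indicate how the integral

\[ I = e^2\int \frac{\phi^2}{r_{12}}\,d\tau \]

may be computed. The method is typical of the evaluation of "double"
volume integrals involving the variable $r_{12}$, and hence perhaps of some
interest. The volume element

\[ d\tau = r_1^2\,dr_1 \sin\theta_1\,d\theta_1\,d\varphi_1 \cdot r_2^2\,dr_2 \sin\theta_2\,d\theta_2\,d\varphi_2 \]

may also be expressed as follows: (see Fig. 4)

\[ d\tau = r_1^2\,dr_1 \sin\theta_1\,d\theta_1\,d\varphi_1 \cdot r_{12}^2\,dr_{12} \sin\psi\,d\psi\,d\chi \]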

Pauling and Wilson, p. 224.

Now $r_2^2 = r_1^2 + r_{12}^2 - 2r_1r_{12}\cos\psi$, whence $r_2\,dr_2 = r_1r_{12}\sin\psi\,d\psi$ provided
$r_1$ and $r_{12}$ are held fixed. By means of this relation $\sin\psi\,d\psi$ may be
eliminated from the last expression for $d\tau$, and we obtain

\[ d\tau = r_1\,dr_1\; r_2\,dr_2\; r_{12}\,dr_{12}\; \sin\theta_1\,d\theta_1\,d\varphi_1\,d\chi \]

Substitute this volume element into I, and integrate at once over the
angles, thus introducing the factors $2\cdot 2\pi\cdot 2\pi$. On using the abbreviation
$\alpha = 2Z/a_0$ we obtain

\[ I = \frac{e^2\alpha^6}{8}\iiint \frac{e^{-\alpha(r_1 + r_2)}}{r_{12}}\; r_1\,dr_1\; r_2\,dr_2\; r_{12}\,dr_{12} \]

The ranges of integration are: $0 \le r_1 < \infty$, $0 \le r_2 < \infty$;
$|r_2 - r_1| \le r_{12} \le r_1 + r_2$. The absolute value sign on the limit for $r_{12}$ forces us
to split the integration over $r_2$ into two parts, (a) $r_2 > r_1$, (b) $r_2 < r_1$. In range (a)
the lower limit of $r_{12}$ is $r_2 - r_1$, in case (b) it is $r_1 - r_2$. Thus

\[ I = \frac{e^2\alpha^6}{8}\left[\int_0^\infty e^{-\alpha r_1}r_1\,dr_1 \int_{r_1}^\infty e^{-\alpha r_2}r_2\,dr_2 \int_{r_2 - r_1}^{r_2 + r_1} dr_{12} + \int_0^\infty e^{-\alpha r_2}r_2\,dr_2 \int_{r_2}^\infty e^{-\alpha r_1}r_1\,dr_1 \int_{r_1 - r_2}^{r_1 + r_2} dr_{12}\right] \]

Inspection shows that the two triple integrals are equal. The calculation
is now perfectly straightforward; it makes use of the formula

\[ \int_0^\infty e^{-px}x^n\,dx = n!\,p^{-(n+1)} \]

and leads to the result

\[ I = \frac{5}{8}\,Z\,\frac{e^2}{a_0} = -\frac{5}{4}ZE_H \]

which was used above.
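
The arithmetic behind this result can be verified exactly. The sketch below is an illustration only: it works in units where $\alpha = 1$, carries out the $r_{12}$ and $r_2$ integrations of the first triple integral by hand (as noted in the comments), and applies the formula $\int_0^\infty e^{-px}x^n\,dx = n!\,p^{-(n+1)}$ to what remains. The names are illustrative.

```python
from fractions import Fraction
from math import factorial

def exp_moment(n, p):
    """Integral of x**n * exp(-p*x) over 0..infinity, equal to n! / p**(n+1)."""
    return Fraction(factorial(n)) / Fraction(p) ** (n + 1)

# Units with alpha = 1.
# Innermost integral over r12 from r2 - r1 to r2 + r1 gives 2*r1.
# Integral over r2 from r1 to infinity of r2*exp(-r2) gives exp(-r1)*(r1 + 1).
# What remains is 2 * integral of r1**2 * (r1 + 1) * exp(-2*r1) over 0..infinity.
triple = 2 * (exp_moment(3, 2) + exp_moment(2, 2))

# Two equal triple integrals, prefactor e^2 * alpha^6 / 8; result in units of e^2 * alpha
i_over_e2_alpha = Fraction(1, 8) * 2 * triple

# alpha = 2Z/a0, so I = z_coefficient * Z * e^2 / a0
z_coefficient = 2 * i_over_e2_alpha

# e^2/a0 = -2 E_H, so I = eh_coefficient * Z * E_H
eh_coefficient = -2 * z_coefficient

print(triple, i_over_e2_alpha, z_coefficient, eh_coefficient)
```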

11.20. The Method of Linear Variation Functions. — It is often convenient
to use as the trial function $\phi$ in $\int \phi^* H \phi\,d\tau$ a linear combination of
definite functions $u_i$ which are judged suitable for the problem at hand.
The coefficients appearing in the linear combination may then be treated as
variable parameters. Thus, assume

\[ \phi = \sum_{\lambda=1}^{n} a_\lambda u_\lambda \tag{11-94} \]

where the $u_\lambda$ need not form an orthonormal set. We define

\[ \int u_i^* u_j\,d\tau = \Delta_{ij}, \qquad \int u_i^* H u_j\,d\tau = \mathcal{H}_{ij}, \qquad E = \frac{\int \phi^* H \phi\,d\tau}{\int \phi^*\phi\,d\tau} \]

The symbol $\mathcal{H}_{ij}$ in place of $H_{ij}$ is to remind the reader of the fact that the
matrix $\mathcal{H}$ does not possess the simple properties of H because the former is
not constructed with an orthonormal, complete set of functions. The
denominator in the expression for E is needed to normalize the function $\phi$.
According to the variational principle, $E \ge E_0$, the lowest energy state
of the system.

We wish to find the condition that E shall be a minimum, and the minimal
value of E. Insertion of (94) gives

\[ E = \frac{\sum_{\lambda\mu} a_\lambda^* a_\mu \mathcal{H}_{\lambda\mu}}{\sum_{\lambda\mu} a_\lambda^* a_\mu \Delta_{\lambda\mu}} \]

This expression will be an extremum, and we hope a minimum, if E is so
adjusted that $\partial E/\partial a_k^*$ and $\partial E/\partial a_k$ are zero for every k from 1 to n. Let us
take the derivative with respect to $a_k^*$ on both sides of the last equation
after it is written in the form

\[ E\sum_{\lambda\mu} a_\lambda^* a_\mu \Delta_{\lambda\mu} = \sum_{\lambda\mu} a_\lambda^* a_\mu \mathcal{H}_{\lambda\mu} \tag{11-95} \]

The result is

\[ \frac{\partial E}{\partial a_k^*}\sum_{\lambda\mu} a_\lambda^* a_\mu \Delta_{\lambda\mu} + E\sum_\mu a_\mu \Delta_{k\mu} = \sum_\mu a_\mu \mathcal{H}_{k\mu}, \qquad k = 1, 2, \cdots, n \]

When the first term is omitted ($\partial E/\partial a_k^* = 0$) the remainder of the
equation represents the condition that E shall be a minimum. Differentiation
of (95) with respect to $a_k$ leads in a similar way to

\[ E\sum_\lambda a_\lambda^* \Delta_{\lambda k} = \sum_\lambda a_\lambda^* \mathcal{H}_{\lambda k} \]

an equation which is simply the conjugate of the former. Both may
conveniently be written

\[ \sum_\lambda a_\lambda(\mathcal{H}_{k\lambda} - \Delta_{k\lambda}E) = 0, \qquad k = 1, 2, \cdots, n \tag{11-96} \]

If this system of equations is to have a solution different from the trivial
one: every $a_\lambda = 0$, then the determinant constructed from the coefficients
of the $a_\lambda$ must vanish. Thus

\[ \begin{vmatrix}
\mathcal{H}_{11} - \Delta_{11}E & \mathcal{H}_{12} - \Delta_{12}E & \cdots & \mathcal{H}_{1n} - \Delta_{1n}E \\
\mathcal{H}_{21} - \Delta_{21}E & \mathcal{H}_{22} - \Delta_{22}E & \cdots & \mathcal{H}_{2n} - \Delta_{2n}E \\
\vdots & \vdots & & \vdots \\
\mathcal{H}_{n1} - \Delta_{n1}E & \mathcal{H}_{n2} - \Delta_{n2}E & \cdots & \mathcal{H}_{nn} - \Delta_{nn}E
\end{vmatrix} = 0 \tag{11-97} \]

This is an equation of the n-th degree in E and therefore has n roots. The
lowest of these will be an approximation to the lowest energy of the system.
The other roots approximate, though in general much more poorly, to the
n − 1 higher states of the system.
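
For n = 2 with real matrix elements, the secular equation (11-97) is a quadratic in E and can be solved in closed form. The following is a minimal sketch under that assumption; the function name and the numerical matrix elements are hypothetical and chosen only to exercise the formula.

```python
import math

def secular_roots_2x2(h11, h12, h22, d11, d12, d22):
    """Roots E of det(H - E*Delta) = 0 for real symmetric 2x2 matrices H and Delta.

    Expanding the determinant gives a*E**2 + b*E + c = 0 with the
    coefficients below.
    """
    a = d11 * d22 - d12 * d12
    b = -(h11 * d22 + h22 * d11 - 2.0 * h12 * d12)
    c = h11 * h22 - h12 * h12
    disc = math.sqrt(b * b - 4.0 * a * c)
    return sorted(((-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)))

# Hypothetical values: two normalized, non-orthogonal functions with overlap 0.4
low, high = secular_roots_2x2(-1.0, -0.5, -1.0, 1.0, 0.4, 1.0)
print(low, high)
```

When the two diagonal elements are equal, the roots reduce to $(\mathcal{H}_{11} \pm \mathcal{H}_{12})/(1 \pm \Delta_{12})$, which is the situation that arises for two equivalent functions.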

11.21. Example: The Hydrogen Molecular Ion Problem. — The 

$\mathrm{H}_2^+$ ion consists of two positive charges +e, which we shall consider
stationary and a distance R apart, and one electron whose distances from the
protons will be denoted by $r_A$ and $r_B$. See Fig. 5.

The Hamiltonian operator is

\[ H = -\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{r_A} - \frac{e^2}{r_B} + \frac{e^2}{R} \]

[Fig. 11-5]

If the terms $e^2/R - e^2/r_B$ were missing, H would be the Hamiltonian of a
hydrogen atom with its proton at A, whose normal state function is (cf.
eq. 67)

\[ u_A = (\pi a_0^3)^{-1/2}\,e^{-r_A/a_0} \]

On the other hand, if the terms $e^2/R - e^2/r_A$ were missing, the normal state
function would be

\[ u_B = (\pi a_0^3)^{-1/2}\,e^{-r_B/a_0} \]

From a physical point of view one of these solutions is as good as the other:
$u_A$ implies that the electron is entirely attached to proton A, $u_B$ that it is
attached to proton B. Neither is the case. Let us see what happens if
we take for the variation function $\phi$ a linear combination of $u_A$ and $u_B$.
We put

\[ \phi = a_Au_A + a_Bu_B \]

using letters as subscripts rather than the number indices which appear in
(94). The lowest energy is at once obtained as the lowest root of (97)
which takes the simple form

\[ \begin{vmatrix}
\mathcal{H}_{AA} - \Delta_{AA}E & \mathcal{H}_{AB} - \Delta_{AB}E \\
\mathcal{H}_{BA} - \Delta_{BA}E & \mathcal{H}_{BB} - \Delta_{BB}E
\end{vmatrix} = 0 \tag{11-98} \]

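In more complicated molecules it is well to label electrons by numbers, nuclei by
letters. We here follow this convention.

Now $\Delta_{AA} = \int u_A^* u_A\,d\tau = \Delta_{BB} = 1$, because $u_A$ and $u_B$ are normalized.
They are not orthogonal, but $\Delta_{AB} = \Delta_{BA}$. Similarly, $\mathcal{H}_{AB} = \int u_A H u_B\,d\tau = \mathcal{H}_{BA}$,
and $\mathcal{H}_{AA} = \mathcal{H}_{BB}$, since H is insensitive to an interchange of A and B. With these
simplifications the two roots of (98) are found to be

\[ E_1 = \frac{\mathcal{H}_{AA} + \mathcal{H}_{AB}}{1 + \Delta_{AB}}, \qquad E_2 = \frac{\mathcal{H}_{AA} - \mathcal{H}_{AB}}{1 - \Delta_{AB}} \]

The $\phi$-functions corresponding to these energies are obtained from (96):

\[ \begin{aligned}
a_A(\mathcal{H}_{AA} - E) + a_B(\mathcal{H}_{AB} - \Delta_{AB}E) &= 0 \\
a_A(\mathcal{H}_{BA} - \Delta_{BA}E) + a_B(\mathcal{H}_{BB} - E) &= 0
\end{aligned} \tag{11-99} \]

On inserting $E = E_1$ we get $a_A = a_B$; hence the corresponding

\[ \phi_1 = c_1(u_A + u_B) \]

If $\phi_1$ is to be normalized, $c_1 = [2(1 + \Delta_{AB})]^{-1/2}$. If $E_2$ is inserted in (99),
we find $a_B = -a_A$, so that

\[ \phi_2 = c_2(u_A - u_B) \]

The normalizing factor is in this case $c_2 = [2(1 - \Delta_{AB})]^{-1/2}$.

The remainder of the work is the computation of the three quantities
$\Delta_{AB}$, $\mathcal{H}_{AA}$ and $\mathcal{H}_{AB}$. It involves nothing new and will be left to the
reader. The integrals are most easily evaluated in spheroidal coordinates
(cf. eq. 5-40) $\xi = (r_A + r_B)/R$, $\eta = (r_A - r_B)/R$ and $\varphi$, the latter
measured around R. In terms of these

\[ d\tau = \frac{R^3}{8}(\xi^2 - \eta^2)\,d\xi\,d\eta\,d\varphi, \quad\text{and}\quad u_Au_B = \frac{1}{\pi a_0^3}\,e^{-\rho\xi}; \]

\[ 1 \le \xi < \infty; \qquad -1 \le \eta \le 1; \qquad 0 \le \varphi \le 2\pi \]

The following results will be found:

\[ \Delta_{AB} = e^{-\rho}\left(1 + \rho + \tfrac{1}{3}\rho^2\right) \]

\[ \mathcal{H}_{AA} = E_H + \frac{e^2}{R} + J, \quad\text{where}\quad J = -\int u_A\,\frac{e^2}{r_B}\,u_A\,d\tau = -\frac{e^2}{R}\left[1 - e^{-2\rho}(1 + \rho)\right] \]

\[ \mathcal{H}_{AB} = \left(E_H + \frac{e^2}{R}\right)\Delta_{AB} + K, \quad\text{where}\quad K = -\int u_A\,\frac{e^2}{r_A}\,u_B\,d\tau = -\frac{e^2}{R}\,e^{-\rho}(\rho + \rho^2) \]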

(11-100)

The parameter $\rho = R/a_0$; $E_H$ is defined as in sec. 19. The quantities
J and K are of interest. According to its definition, J represents the
Coulomb attraction energy between a negative charge of density $-eu_A^2$
and the proton B. The integral K has no such simple interpretation; it is
called an exchange integral. Its importance is best appreciated if $E_1$ and
$E_2$ are written more explicitly with the use of (100):

\[ E_1 = E_H + \frac{e^2}{R} + \frac{J + K}{1 + \Delta_{AB}} \]

\[ E_2 = E_H + \frac{e^2}{R} + \frac{J - K}{1 - \Delta_{AB}} \]

Because K is negative, $E_1$ is the lower root. Had we omitted the function
$u_B$ from our trial function $\phi$, the variational result would have been

\[ E = E_H + \frac{e^2}{R} + J \]

$E_1$ is lower than this by virtue of the presence of K (and of course $\Delta_{AB}$).
But in classical parlance, a lower energy must be regarded as due to the
presence of additional attractive forces between the constituents of the
system, i.e., a hydrogen atom and a proton. These forces would be
given by $\partial K/\partial R$; they are commonly called exchange forces. They
possess no classical interpretation; their significance is rooted entirely in the
variational method through which they arise.

Of course $E_1$ is only an approximation to the true energy, which is
lower for every R. Its most important feature is that it possesses a
minimum, which explains the stability of the ion. Classical mechanics
would yield no minimum and is therefore incompetent to account for the
existence of this ion. A detailed comparison of $E_1$ with the experimental
energy is given in Pauling and Wilson.
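
The existence of the minimum in $E_1$ can be seen by tabulating the two roots as functions of $\rho = R/a_0$. The sketch below is an illustration under stated assumptions: energies are measured in units of $e^2/a_0$ (so that $E_H = -1/2$ and $e^2/R = 1/\rho$), the expressions for $\Delta_{AB}$, J and K are those of (11-100), and the grid limits and function names are arbitrary choices made for the example.

```python
import math

E_H = -0.5  # energy of the normal hydrogen atom in units of e^2/a0

def overlap(rho):
    """Delta_AB = exp(-rho) * (1 + rho + rho^2/3)."""
    return math.exp(-rho) * (1.0 + rho + rho * rho / 3.0)

def coulomb_j(rho):
    """J in units of e^2/a0: -(1/rho) * [1 - exp(-2 rho) * (1 + rho)]."""
    return -(1.0 / rho) * (1.0 - math.exp(-2.0 * rho) * (1.0 + rho))

def exchange_k(rho):
    """K in units of e^2/a0: -(1/rho) * exp(-rho) * (rho + rho^2)."""
    return -math.exp(-rho) * (1.0 + rho)

def e1(rho):
    """Lower root, belonging to u_A + u_B."""
    return E_H + 1.0 / rho + (coulomb_j(rho) + exchange_k(rho)) / (1.0 + overlap(rho))

def e2(rho):
    """Upper root, belonging to u_A - u_B."""
    return E_H + 1.0 / rho + (coulomb_j(rho) - exchange_k(rho)) / (1.0 - overlap(rho))

# Scan rho from 0.51 to 8.00 in steps of 0.01 and locate the lowest E1
grid = [0.5 + 0.01 * i for i in range(1, 751)]
rho_min = min(grid, key=e1)

print("rho at minimum of E1:", rho_min)
print("E1 there:", e1(rho_min), " E_H:", E_H)
print("E2 at the same rho:", e2(rho_min))
```

In this scan $E_1$ dips below $E_H$ (the energy of a separated hydrogen atom and proton) at an intermediate separation, while $E_2$ stays above $E_H$, in line with the discussion above.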


Problem. Let $u_1$, $u_2$, $u_3$ be the three lowest energy states of the simple harmonic
oscillator, $H_0$ its Hamiltonian. The Hamiltonian for an oscillator in an electric field is
$H = H_0 + kx$, where k is a constant. Calculate by the variational method the lowest
energy of this system, using as trial functions (a) $u_1$, (b) $a_1u_1 + a_2u_2$,
(c) $a_1u_1 + a_2u_2 + a_3u_3$.

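Ans. (a) $\tfrac12 h\nu$; (b) $h\nu - \tfrac12[(h\nu)^2 + 4k^2x_{12}^2]^{1/2} \approx \tfrac12 h\nu - k^2x_{12}^2/h\nu$;
(c) $\tfrac12 h\nu - h\nu\,k^2x_{12}^2/[(h\nu)^2 - \tfrac12 k^2x_{23}^2]$ (approximately). Here $x_{ij}$ is
defined as $\int u_i^*\,x\,u_j\,d\tau$, as usual.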


11.22. Perturbation Theory. — The following problem is frequently met 
in quantum mechanics. We know the energy states of a given system, say 
an atom, and also its eigenfunctions. A small perturbation, such as an 


Pauling and Wilson, loc. cit.

electric or magnetic field, is now imposed; this changes, presumably by
slight amounts, both energies and state functions. Mathematically, the
situation is described in this way. We know the solutions and eigenvalues
of

\[ H^0\psi_i = E_i^0\psi_i \tag{11-101} \]

where $H^0$ is the "unperturbed" Hamiltonian. We wish to find solutions
and eigenvalues of

\[ H\phi_i = E_i\phi_i, \qquad H = H^0 + H' \tag{11-102} \]

$H'$ being considered as a small addition to $H^0$. (By a small operator
we mean one whose matrix elements, formed with the functions $\psi_i$, are all
small compared with the diagonal elements of $H^0$.)

To solve the problem we use the method of linear variation functions,
using as our trial function

\[ \phi = \sum_\lambda a_\lambda\psi_\lambda \tag{11-103} \]

If we allow an infinite number of terms in this summation and choose
the coefficients properly, we expect $\phi$ to be the correct solution of (102), for
the $\psi_i$ of (101) form a complete set. But since the $\psi_i$ are orthonormal, the
energies are given as the roots of (97) with every $\Delta_{ij}$ replaced by a
Kronecker $\delta_{ij}$, so that E appears only in the principal diagonal. Moreover,

\[ \mathcal{H}_{ij} = H_{ij} = E_i^0\delta_{ij} + H'_{ij}, \qquad H'_{ij} = \int \psi_i^* H' \psi_j\,d\tau \]

Hence the determinant reads

\[ \begin{vmatrix}
H'_{11} - (E - E_1^0) & H'_{12} & H'_{13} & H'_{14} & \cdots \\
H'_{21} & H'_{22} - (E - E_2^0) & H'_{23} & H'_{24} & \cdots \\
H'_{31} & H'_{32} & H'_{33} - (E - E_3^0) & H'_{34} & \cdots \\
H'_{41} & H'_{42} & H'_{43} & H'_{44} - (E - E_4^0) & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{vmatrix} = 0 \tag{11-104} \]

If all its roots could actually be found they would indeed be the exact
energies of our problem. But in the case we are visualizing certain simplifying
approximations are in order. Suppose we are interested in the
energy $E_1$, that is, the energy to which $E_1^0$ is changed by the perturbation.
($E_1$ need not be the lowest energy of our system, for the states may be
labeled in an arbitrary order.) If $E_1^0$ is a non-degenerate level, then $E_1$
will lie much closer to $E_1^0$ than to any other unperturbed $E_i^0$. This suggests
the following approximations:

a. Put $E = E_1^0$ in all diagonal elements except the first.

b. Since every difference $E_i^0 - E_1^0$ for $i \ne 1$ is large compared to $H'_{ii}$,
the latter may be omitted in all diagonal elements except the first.

c. Neglect all non-diagonal elements except those in the first row and
the first column, since these affect $E_1$ only in a secondary way.

When this is done, the determinant reads (we now write $\Delta E_1$ for the
perturbation $E_1 - E_1^0$ we are seeking)

\[ \begin{vmatrix}
H'_{11} - \Delta E_1 & H'_{12} & H'_{13} & H'_{14} & \cdots \\
H'_{21} & E_2^0 - E_1^0 & 0 & 0 & \cdots \\
H'_{31} & 0 & E_3^0 - E_1^0 & 0 & \cdots \\
H'_{41} & 0 & 0 & E_4^0 - E_1^0 & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{vmatrix} = 0 \tag{11-105} \]

It may be evaluated by the usual process of adding multiples of rows or
columns. In this instance, multiply the second row by $H'_{12}/(E_2^0 - E_1^0)$,
and then subtract it from the first. The element $H'_{12}$ will then disappear
from the first row, but the first element is converted into

\[ H'_{11} - \Delta E_1 - \frac{H'_{12}H'_{21}}{E_2^0 - E_1^0} \]

Next, multiply the third row by $H'_{13}/(E_3^0 - E_1^0)$ and subtract it from
the first. The result will be disappearance of $H'_{13}$ and addition of
$-H'_{13}H'_{31}/(E_3^0 - E_1^0)$ to the first element. This process is continued until
all non-diagonal elements of the first row have disappeared. We now have

\[ \left(H'_{11} - \Delta E_1 - \sum_{\lambda=2}^{\infty}\frac{H'_{1\lambda}H'_{\lambda 1}}{E_\lambda^0 - E_1^0}\right)(E_2^0 - E_1^0)(E_3^0 - E_1^0)\cdots = 0 \]

If $E_1$ is non-degenerate, as we are supposing, none of the parentheses
except the first can be zero. We therefore conclude

\[ \Delta E_1 = H'_{11} - \sum_{\lambda=2}^{\infty}\frac{H'_{1\lambda}H'_{\lambda 1}}{E_\lambda^0 - E_1^0} \tag{11-106} \]

and this is the Rayleigh-Schrödinger perturbation formula. The quantity
$H'_{11}$ is often called the first-order perturbation, the sum on the right is called
the second-order perturbation. By retaining more elements in (104) third
and higher orders may be computed, but these are rarely used. When the
approximation (106) is not sufficient it is generally preferable to return to
the variation scheme, or to find a more successful way of evaluating the
determinant (104).

Formula (106) may, of course, be used to calculate the perturbation in
any energy level which is non-degenerate; to show this fact it may be
written in the form

\[ \Delta E_k = H'_{kk} - {\sum_\lambda}'\,\frac{|H'_{k\lambda}|^2}{E_\lambda^0 - E_k^0} \tag{11-107} \]

where we have also used the Hermitian property of $H'$. The prime on the
summation symbol indicates that the term in which $\lambda = k$ should be
omitted.

Next, let us find the coefficients $a_\lambda$ in (103). They are obtained from
(96) which now reads

\[ \sum_\lambda a_\lambda(H_{k\lambda} - E\delta_{k\lambda}) = 0, \qquad k = 1, 2, \cdots \]

In accordance with the approximations which led to eq. (106) we put
$E = E_1^0$ and neglect every $H'_{ij}$ unless one of the subscripts is 1. We then
find

\[ \begin{aligned}
a_1H'_{21} + a_2(E_2^0 - E_1^0) &= 0 \quad\text{if } k = 2 \\
a_1H'_{31} + a_3(E_3^0 - E_1^0) &= 0 \quad\text{if } k = 3, \text{ etc.}
\end{aligned} \]

Hence

\[ a_\lambda = a_1\,\frac{H'_{\lambda 1}}{E_1^0 - E_\lambda^0}, \qquad \lambda \ne 1 \]

or in general, if we are interested not in $E_1$ but in $E_k$,

\[ a_\lambda = a_k\,\frac{H'_{\lambda k}}{E_k^0 - E_\lambda^0}, \qquad \lambda \ne k \tag{11-108} \]

The coefficient $a_k$ must be chosen so that $\phi$ is normalized. Since all other
$a_\lambda$ are small, its value is very nearly unity and may be taken as such.
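
Formulas (11-107) and (11-108) are easy to try out on a small matrix for which the exact answer is available. The sketch below is an illustration only: the two unperturbed energies and the real symmetric perturbation matrix are hypothetical numbers, indices are counted from zero as is usual in Python, and $a_k$ is set to unity as suggested above.

```python
import math

def second_order_shift(k, e0, h_prime):
    """Delta E_k = H'_kk - sum over lam != k of |H'_k,lam|^2 / (E0_lam - E0_k)."""
    shift = h_prime[k][k]
    for lam in range(len(e0)):
        if lam != k:
            shift -= abs(h_prime[k][lam]) ** 2 / (e0[lam] - e0[k])
    return shift

def first_order_coefficients(k, e0, h_prime):
    """a_lam = H'_lam,k / (E0_k - E0_lam) for lam != k, with a_k taken as 1."""
    return [
        1.0 if lam == k else h_prime[lam][k] / (e0[k] - e0[lam])
        for lam in range(len(e0))
    ]

# Hypothetical two-level system with a weak perturbation
e0 = [0.0, 1.0]
h_prime = [[0.01, 0.05], [0.05, -0.02]]

estimate = e0[0] + second_order_shift(0, e0, h_prime)
coefficients = first_order_coefficients(0, e0, h_prime)

# Exact lowest eigenvalue of the 2x2 matrix H0 + H' for comparison
a = e0[0] + h_prime[0][0]
d = e0[1] + h_prime[1][1]
b = h_prime[0][1]
exact = 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)

print("perturbation estimate:", estimate)
print("exact lowest root:    ", exact)
print("coefficients a_lam:   ", coefficients)
```

For these numbers the estimate and the exact root differ by less than one part in ten thousand of the level spacing; the agreement deteriorates as the off-diagonal elements approach the spacing between the unperturbed levels.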

Formulas (107) and (108) have been derived by assuming that the level,
k, whose perturbation is being calculated, was non-degenerate. For
degenerate levels both formulas obviously fail, for they contain terms with
vanishing denominators (several $E_\lambda^0$ being equal to $E_k^0$). To deal with the
case of degeneracy we have to return to the fundamental determinant
(104). If the functions $u_1, u_2, \cdots, u_n$ all belong to the same energy (we
then say that the level $E_1^0$ has an n-fold degeneracy), these functions are
equally concerned in the perturbation, and if we formerly retained all
matrix elements of the form $H'_{1\lambda}$, we must now retain $H'_{2\lambda}, H'_{3\lambda}, \cdots, H'_{n\lambda}$
also. But for most purposes sufficient accuracy results if we neglect all
elements connecting a state of the degenerate group with all states not
belonging to that group. Eq. (104) reduces in this case to

\[ \begin{vmatrix}
H'_{11} - \Delta E & H'_{12} & H'_{13} & \cdots & H'_{1n} \\
H'_{21} & H'_{22} - \Delta E & H'_{23} & \cdots & H'_{2n} \\
H'_{31} & H'_{32} & H'_{33} - \Delta E & \cdots & H'_{3n} \\
\vdots & \vdots & \vdots & & \vdots \\
H'_{n1} & H'_{n2} & H'_{n3} & \cdots & H'_{nn} - \Delta E
\end{vmatrix} = 0 \tag{11-109} \]

The n roots of this equation (of which some may coincide) are the energies
into which $E_1^0$ will "split" as the result of the perturbation. They cannot,
of course, be represented by a general formula.

These energies are said to represent the first-order perturbation. If
greater accuracy is desired the work may be continued in this way. By
substituting the first-order energies into eqs. (96) and neglecting all states
not belonging to the degenerate group, n sets of coefficients $a_1, a_2, \cdots, a_n$
are found, each set belonging to a single first-order energy. This yields
n functions

\[ v_i = \sum_{\lambda=1}^{n} a_{i\lambda}u_\lambda \]

If now we construct matrix elements with the v-functions, $H'_{ij} = \int v_i^* H' v_j\,d\tau$,
these will be diagonal; for solving (109) is the well-known
procedure for diagonalizing the matrix $H'$. (See Chapter 10.) Hence,
when the v-functions are chosen to represent the n degenerate states, the
second order perturbation can be computed by formula (107), from which
the terms with vanishing denominator are now absent because every $H'_{k\lambda}$
corresponding to them is zero.

11.23. Example: Non-Degenerate Case. The Stark Effect. — Let 

$H^0$ represent the Hamiltonian operator for any one-electron system, and
let $\psi_1, \psi_2, \cdots$ be its eigenfunctions. When a uniform electric field along X is
applied, the term $H' = -eFx$ is added to $H^0$, e being the electronic charge
and F the field strength. The normal state of the system is non-degenerate,
hence formula (107) may be used. Denoting the normal state by the
subscript zero, we find

\[ \Delta E_0 = -eFx_{00} - e^2F^2\,{\sum_\lambda}'\,\frac{|x_{0\lambda}|^2}{E_\lambda^0 - E_0^0} \tag{11-110} \]

Here $x_{0\lambda} = \int \psi_0^*\,x\,\psi_\lambda\,d\tau$. The first term on the right is usually zero because
$|\psi_0|^2$ is an even function of x; thus the "first-order Stark effect" is absent.

In classical physics, the increment in energy of an atom due to a static
electric field is expressed in terms of the polarizability $\alpha$ in the form

\[ \Delta E = -\tfrac12\alpha F^2 \]

On comparing this with (110) we find for the polarizability of the normal
state of our system

\[ \alpha = 2e^2\,{\sum_\lambda}'\,\frac{|x_{0\lambda}|^2}{E_\lambda^0 - E_0^0} \]

For an oscillator, this takes a particularly simple form, since all $x_{0\lambda}$
vanish with the exception of $x_{01} = \sqrt{1/2\beta}$ (cf. Chapter 3, eqs. 92 and 93).
Also, $E_\lambda^0 = (\lambda + \tfrac12)h\nu$. Thus

\[ \alpha = 2e^2\,\frac{x_{01}^2}{h\nu} \]

Comparison with the problem of sec. 21 shows that second-order perturbation
theory gives in this instance the same result as the variational
method with the trial function $a_0\psi_0 + a_1\psi_1$. In general, however, the
use of a simple variation function yields a much poorer result for the
polarizability than the method of sec. 22.

11.24. Example: Degenerate Case. The Normal Zeeman Effect. — 
The energy states of the hydrogen atom were found to be

\[ R_{n,l}(r)\,Y_l(\theta, \varphi) \]

To a given l, there belong 2l + 1 spherical harmonics of the form

\[ Y_l = \sum_{m=-l}^{l} c_m P_l^m(\cos\theta)\,e^{im\varphi}, \qquad (P_l^{-m} = P_l^m) \]

and each such combination, with its own set of coefficients $c_m$, forms a proper
eigenfunction when multiplied by $R_{n,l}$. The energy does not depend on m;
the state under consideration has therefore a (2l + 1)-fold degeneracy.

Let us choose the 2l + 1 functions in the simplest possible way, namely
by letting each $Y_l$ contain only one term, as follows:

\[ R_{n,l}\,c_{-l}P_l^{l}e^{-il\varphi}, \quad R_{n,l}\,c_{-l+1}P_l^{l-1}e^{-i(l-1)\varphi}, \quad \cdots, \quad R_{n,l}\,c_{l}P_l^{l}e^{il\varphi} \]

and label them $\psi_1, \psi_2, \cdots, \psi_{2l+1}$ in that order.

The Zeeman effect is the splitting of the energy levels of an atom in a
magnetic field. When a uniform magnetic field along the Z-axis and of
strength F is applied to the hydrogen atom, its unperturbed Hamiltonian
takes on the extra term

\[ H' = -i\,\frac{e\hbar F}{2Mc}\,\frac{\partial}{\partial\varphi} \equiv -iA\,\frac{\partial}{\partial\varphi} \]

Each matrix element $H'_{ij} = \int \psi_i^* H' \psi_j\,d\tau$ contains the factor

\[ \int_0^\infty R_{n,l}^2\,r^2\,dr \]


See Van Vleck, J. H., "The Theory of Electric and Magnetic Susceptibilities,"
Oxford, 1932. We write here M for the electron mass to avoid conflict with the
summation index m (magnetic quantum number).

which, by virtue of the normalization of the radial functions, is unity. If
we take a non-diagonal element, for example $H'_{12}$, it contains the factor

\[ \int_0^{2\pi} e^{il\varphi}\,\frac{\partial}{\partial\varphi}\,e^{-i(l-1)\varphi}\,d\varphi \]

and this vanishes. In the same way all other non-diagonal matrix elements
are seen to be zero. The diagonal element $H'_{11}$ is

\[ H'_{11} = -iA\int \psi_1^*\,\frac{\partial\psi_1}{\partial\varphi}\,d\tau = -iA(-il)\int \psi_1^*\psi_1\,d\tau = -Al \]

and the others are similarly constructed.

When these elements are substituted into (109) we have

\[ \begin{vmatrix}
-lA - \Delta E & 0 & 0 & \cdots & 0 \\
0 & -(l - 1)A - \Delta E & 0 & \cdots & 0 \\
0 & 0 & -(l - 2)A - \Delta E & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & lA - \Delta E
\end{vmatrix} = 0 \]

The determinant is already diagonal; our choice of functions was a
fortunate one. The perturbed energies are clearly

\[ \Delta E = mA = m\,\frac{e\hbar F}{2Mc}, \qquad m = -l,\ -l + 1,\ \cdots,\ 0,\ 1,\ \cdots,\ l \]

Classically, an electron in a magnetic field F performs a uniform precession
of angular frequency $\omega_L = eF/2Mc$, known as the Larmor frequency.
Thus we see that $\Delta E = m\hbar\omega_L$.


Problem. Calculate the Stark effect of the rigid rotator (cf. sec. 11.12), for the
state l = 3, adopting the same choice for the spherical harmonics as above. Here
$H' = -eaF\cos\theta$, provided the electric field F is along Z. The determinant will not be
diagonal. To calculate the matrix elements, use formulas (3-48 and 53).


TIME-DEPENDENT STATES. SCHRÖDINGER'S TIME EQUATION

11.25. General Considerations. — In all preceding considerations we 
have assumed that the states of the systems in question were stationary 
ones, that the time coordinate could be disregarded in describing them. In 
generalizing the theory so as to make it applicable to states which change in 
time it is well to look back and see why a time-free description was possible 
thus far. 

It is important to note that the time, t, in classical mechanics is canonically
conjugate to the energy, E, in the same sense that x is conjugate to
$p_x$. Let us then for the moment consider the operator $p_x = -i\hbar(\partial/\partial x)$.
Its eigenstates were seen to be (cf. eq. 9) $\psi_p = c\,e^{(i/\hbar)px}$. What do they tell
us about the distribution of the system in x? The answer is, it is uniform.
Whatever is true at the point $x_1$ is also true at the point $x_2$. This is the
meaning of the uncertainty principle applied to the case at hand: if the
momentum is known with certainty, the state function is entirely
non-committal with regard to x. If in the calculation of the mean value of an
operator Q,

\[ \bar{Q} = \int \psi_p^*\,Q\,\psi_p\,dx, \]

Q did not depend on x, we could have afforded to neglect the factor $e^{(i/\hbar)px}$ of
$\psi_p$ altogether. It had to be included, however, because most operators of
interest do depend on x.

But this trivial situation existed with regard to the time coordinate in
all the Schrödinger problems considered heretofore. The states were those
in which the energy was known with certainty, and for this reason the state
functions were completely indiscriminate in respect to t. What was true
at $t_1$ was also true at $t_2$. Moreover, the other operators used were independent
of t. This condition will always be present as long as we are dealing
with closed systems, for the energy will then be constant in time.

When the system is an open one, the present method must clearly fail.
But the last remarks contain the hint that we should, perhaps, associate
with E the operator $-i\hbar(\partial/\partial t)$. This would lead to the eigenvalue
equation

\[ -i\hbar\,\frac{\partial}{\partial t}\psi = E\psi \]

which is certainly too simple because the energy depends on other things
beside the time. The example above gives us no definite lead at this point
because $p_x$ does possess the single dependence on x. There is, however,
only one reasonable way to include these other variables, namely, to put
them into E, which thereby ceases to be an eigenvalue: E must be replaced
by the Hamiltonian operator H. We then arrive at Schrödinger's time
equation

\[ -i\hbar\,\frac{\partial}{\partial t}\psi = H\psi \tag{11-111} \]

H is to be constructed as before by replacement of every Cartesian
momentum $p_i$ by $-i\hbar(\partial/\partial q_i)$, and the dependence on t is to be introduced
explicitly.

It is immaterial, of course, whether we choose eq. (111) or its complex
conjugate equation. The latter choice has certain advantages and will
here be made. Furthermore, we shall use the symbol u (more or less
generally) for time-dependent state functions and thus record Schrödinger's
time equation in the form

\[ Hu = i\hbar\,\frac{\partial u}{\partial t} \tag{11-112} \]

This equation, being of the first order in t, permits prediction of the state u
at any future (or past) time when u is known as a function of the coordinates
at present. Although it is closely related to the preceding developments,
eq. (112) is a new postulate not derivable from those already given.

The present theory must be valid also in the special case when $H$ does
not contain $t$. When that is true eq. (112) is separable. On writing
$u = \psi(q_1 \cdots q_n)\cdot f(t)$ it becomes equivalent to the equation

$$\frac{H\psi}{\psi} = i\hbar\,\frac{df/dt}{f}$$

each side of which must represent a constant. But in view of the form of
the left-hand side, that constant must be one of the eigenvalues of the
operator $H$, say $E_\lambda$, so that

$$\frac{df}{dt} = -\frac{i}{\hbar}\,E_\lambda f$$

Hence

$$f = e^{-(i/\hbar)E_\lambda t}$$

The general solution of eq. (112) for the special case in which $H$ is
independent of the time is

$$u = \sum_\lambda c_\lambda \psi_\lambda\, e^{-(i/\hbar)E_\lambda t} \tag{11-113}$$


We have formerly said that any state function, such as $u$, could be
expanded in the orthonormal system of functions $\psi_\lambda$. This expansion
was written as

$$u = \sum_\lambda a_\lambda \psi_\lambda$$

We now see that this is indeed true even when the analysis is made on
the basis of eq. (112), but the coefficients $a_\lambda$ are always functions
of the time: $a_\lambda = c_\lambda e^{-(i/\hbar)E_\lambda t}$. The mean value
of the energy, computed for the state (113), is

$$\bar{E} = \sum_\lambda |a_\lambda|^2 E_\lambda = \sum_\lambda |c_\lambda|^2 E_\lambda$$

It is independent of $t$. But the probability of finding the system at the
point $q_1 \cdots q_n$ of configuration space,

$$u^{*}u = \sum_{\lambda\mu} c_\lambda^{*} c_\mu\, \psi_\lambda^{*}\psi_\mu\, e^{(i/\hbar)(E_\lambda - E_\mu)t},$$

is a superposition of oscillating functions of the time. The only way for this
time dependence to be obliterated would be to have $c_\lambda = \delta_{\lambda j}$
in (113), in which case

$$u^{*}u = \psi_j^{*}\psi_j$$

Thus, whenever a state is formed by superposition of energy eigenstates,
the mean energy of the system remains constant, but the configuration of
the system changes in time. The reader should note, of course, that the
solution of the Schrödinger equation (12) when multiplied by
$e^{-(i/\hbar)Et}$ is also a solution of (112), but that the solution of (112)
does not in general satisfy (12).
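
The statement that the mean energy stays fixed while $u^{*}u$ oscillates can be
checked numerically. The following Python sketch is only an illustration and is
not part of the original treatment: it assumes units in which $\hbar = 1$ and two
hypothetical eigenstates whose energies, coefficients $c_\lambda$, and values at
one fixed point of configuration space are invented for the purpose.

```python
import cmath

HBAR = 1.0  # assumption: natural units, hbar = 1


def time_dependent_coeffs(coeffs, energies, t):
    """a_lambda(t) = c_lambda * exp(-i E_lambda t / hbar), as in eq. (113)."""
    return [c * cmath.exp(-1j * e * t / HBAR) for c, e in zip(coeffs, energies)]


def mean_energy(a_coeffs, energies):
    """Mean energy: sum over lambda of |a_lambda|^2 E_lambda."""
    return sum(abs(a) ** 2 * e for a, e in zip(a_coeffs, energies))


def superposition(coeffs, energies, psi_values, t):
    """u at one point: sum of c_lambda psi_lambda exp(-i E_lambda t / hbar)."""
    a_coeffs = time_dependent_coeffs(coeffs, energies, t)
    return sum(a * p for a, p in zip(a_coeffs, psi_values))


# Hypothetical two-state example (all numbers are illustrative assumptions).
coeffs = [0.6, 0.8]        # normalized: 0.36 + 0.64 = 1
energies = [1.0, 3.0]
psi_values = [1.0, 0.5]    # values of psi_1, psi_2 at one fixed point

for t in (0.0, 0.5, 1.0, 1.5):
    a = time_dependent_coeffs(coeffs, energies, t)
    density = abs(superposition(coeffs, energies, psi_values, t)) ** 2
    print(f"t={t:.1f}  mean energy={mean_energy(a, energies):.4f}  u*u={density:.4f}")
```

For these particular numbers the density works out to $0.52 + 0.48\cos 2t$, an
oscillation at the difference frequency $(E_2 - E_1)/\hbar$, while the mean energy
column does not change.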

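Problem. Let the time-dependent Hamiltonian be $H = H_0 + F(t)$, where $H_0$
acts only on space coordinates and has eigenfunctions $\psi_\lambda$, eigenvalues
$E_\lambda$. Show that

$$u = \sum_\lambda c_\lambda \psi_\lambda \exp\left\{-\frac{i}{\hbar}\left[E_\lambda t + \int_0^t F(t')\,dt'\right]\right\}$$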

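11.26. The Free Particle; Wave Packets. — The eigenfunctions of the
energy of a free mass point (cf. sec. 11.9) moving in one dimension without
restriction are $\psi_k = e^{ikx}$, its energies are $E_k = \hbar^2k^2/2m$, and
there is no quantization. The general solution of eq. (112) for the free
particle is therefore

$$u = \int_{-\infty}^{\infty} c(k)\, e^{i[kx - (\hbar/2m)k^2 t]}\,dk \tag{11-114}$$

a function constructed after the manner of (113) but with an integral
instead of a sum. An integral very similar to this has been already
encountered in the mathematical formulation of waves (cf. eq. 7-38) and
of diffusion phenomena (eq. 7-53). It is interesting to inquire what form
$u$ will have at some time $t$ if at $t = 0$ it is given by $u = u_0(x)$. The
coefficient $c(k)$ may be determined by Fourier analysis. We have

$$u_0(x) = \int_{-\infty}^{\infty} c(k)\, e^{ikx}\,dk$$

whence by eq. 8-13

$$c(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty} u_0(\xi)\, e^{-ik\xi}\,d\xi$$

Eq. (114) therefore reads

$$u = \frac{1}{2\pi}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} u_0(\xi)\, e^{i[k(x-\xi) - (\hbar/2m)k^2 t]}\,d\xi\,dk$$

In this instance, the integration over $k$ cannot be performed (as it
could in the diffusion problem, sec. 7.14). To proceed further it is
necessary to introduce the function $u_0$ explicitly.

Assume that $u_0 = e^{-x^2/2a^2}$. Then, with the use of the formula

$$\int_{-\infty}^{\infty} e^{-\alpha x^2 + \beta x}\,dx = \sqrt{\frac{\pi}{\alpha}}\; e^{\beta^2/4\alpha} \tag{11-115}$$

we find

$$c(k) = \frac{a}{\sqrt{2\pi}}\, e^{-a^2k^2/2}$$

Hence

$$u = \frac{a}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-[(a^2/2) + (i\hbar/2m)t]k^2 + ikx}\,dk
  = \left(1 + \frac{i\hbar t}{ma^2}\right)^{-1/2} \exp\left\{-\frac{x^2}{2a^2\left(1 + \dfrac{i\hbar t}{ma^2}\right)}\right\} \tag{11-116}$$

again with the aid of (115).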

Eq. (114) represents a superposition of waves of wave length $2\pi/k$ and
frequency $\nu = (\hbar/4\pi m)k^2$. The form of $u_0$ here chosen describes a
concentration of waves about the origin, a phenomenon called a "wave packet."
Such a wave packet does not retain its spatial distribution; eq. (116) is
characteristic of the manner in which it diffuses.

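From the point of view of quantum mechanics, $u_0^{*}u_0$ is the probability
density of the particle at $t = 0$. It represents a Gauss error function of
width $a$. At time $t$,

$$u^{*}u = \left(1 + \frac{\hbar^2t^2}{m^2a^4}\right)^{-1/2} \exp\left\{-\frac{x^2}{a^2 + \dfrac{\hbar^2t^2}{m^2a^2}}\right\}$$

The probability density is still a Gauss function, but of smaller maximum
and of width $\left[a^2 + \hbar^2t^2/m^2a^2\right]^{1/2}$.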
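
The growth of the width with time is easy to tabulate. The Python sketch below
is an illustration under stated assumptions: it uses CGS values of $\hbar$ and of
the electron mass, and the localization $a = 10^{-8}$ cm in the usage lines is an
arbitrary illustrative choice, not a value taken from the text. The function
names are hypothetical.

```python
import math

HBAR = 1.054571817e-27          # erg s (CGS), to match the cm and gram units used here
ELECTRON_MASS = 9.1093837e-28   # g


def packet_width(a, m, t):
    """Width [a^2 + (hbar t / (m a))^2]^(1/2) of the probability density at time t."""
    return math.sqrt(a ** 2 + (HBAR * t / (m * a)) ** 2)


def peak_density(a, m, t):
    """Maximum of u*u at time t, relative to its value 1 at t = 0."""
    return 1.0 / math.sqrt(1.0 + (HBAR * t / (m * a ** 2)) ** 2)


def doubling_time(a, m):
    """Time at which the width has grown from a to 2a, i.e. sqrt(3) m a^2 / hbar."""
    return math.sqrt(3.0) * m * a ** 2 / HBAR


# Illustrative values only (assumed, not from the text).
a = 1.0e-8                      # cm
t2 = doubling_time(a, ELECTRON_MASS)
print("doubling time (s):", t2)
print("width at that time / a:", packet_width(a, ELECTRON_MASS, t2) / a)
print("peak density at that time:", peak_density(a, ELECTRON_MASS, t2))
```

The product of the width and the peak density stays equal to $a$ at all times,
which is one way of seeing that the total probability is preserved while the
packet spreads.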

Problem a. Compute how long it would take an electron, localized within 
a = 10“^^ cm., to diffuse through twice that distance. 

b. How long would it take an object weighing one gram, localized within 1 cm., to
diffuse through twice that distance?

c. Show that if $u_0 = e^{iKx}$, where $K$ is a constant, the wave will be of the form

$$u = e^{i[Kx - (\hbar/2m)K^2 t]}$$

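If our particle is free to move in three dimensions, then as shown in
sec. 11.9,

$$\psi_{\mathbf{k}} = e^{i\mathbf{k}\cdot\mathbf{r}} \quad\text{and}\quad E_{\mathbf{k}} = \frac{\hbar^2k^2}{2m}$$

Hence (114) has the form

$$u = \int c(\mathbf{k})\, e^{i[\mathbf{k}\cdot\mathbf{r} - (\hbar/2m)k^2 t]}\,d\mathbf{k} \tag{11-117}$$

Again, if $u = u_0(x,y,z)$ at $t = 0$,

$$u_0 = \int c(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{k}$$

whence by 3-dimensional Fourier analysis

$$c(\mathbf{k}) = \frac{1}{(2\pi)^3}\int u_0(\boldsymbol{\rho})\, e^{-i\mathbf{k}\cdot\boldsymbol{\rho}}\,d\boldsymbol{\rho}$$

the vector $\boldsymbol{\rho}$ having components $\xi$, $\eta$, $\zeta$.

Assume now, in analogy with the one-dimensional case, that

$$u_0 = e^{-r^2/2a^2}$$

At $t = 0$ the wave packet is a spherical concentration of waves centered
about the origin; the probability packet has a similar shape and a width $a$.
On inserting $u_0$ into the relation for $c$ we have

$$c = (2\pi)^{-3/2}\, a^3\, e^{-(a^2/2)k^2}$$

This gives

$$u = (2\pi)^{-3/2}\, a^3 \int_{-\infty}^{\infty} e^{-[(a^2/2) + (i\hbar/2m)t]k_1^2 + ik_1x}\,dk_1$$

times two similar integrals with $k_1$ replaced by $k_2$ and $k_3$, $x$ by $y$ and $z$. Hence

$$u = \left(1 + \frac{i\hbar t}{ma^2}\right)^{-3/2} \exp\left\{-\frac{r^2}{2a^2\left(1 + \dfrac{i\hbar t}{ma^2}\right)}\right\}$$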

The interpretation of this result is not different from that of (116). 

Before leaving the subject of "particle waves," we should remark that
every component wave of the packet (117), being of the form
$e^{i(\mathbf{k}\cdot\mathbf{r} - 2\pi\nu t)}$, travels in a positive direction
along $\mathbf{k}$. Had we chosen the sign as in eq. (111) and not as in (112),
the waves would have been of the form $e^{i(\mathbf{k}\cdot\mathbf{r} + 2\pi\nu t)}$,
which implies that they travel along $-\mathbf{k}$. Since $\mathbf{k}\hbar$
represents the momentum of the particle, the latter choice is an unsuitable
one. We also note that the wave length $\lambda = 2\pi/k = 2\pi\hbar/mv = h/mv$
conforms to the De Broglie formula. The phase velocity of the waves is
$\nu\lambda = \hbar k/2m = mv/2m = v/2$, but their group velocity, defined as
$2\pi(d\nu/dk) = v$, is equal to the classical speed of the particle.
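
The two velocities can be compared directly from the dispersion law
$\nu = (\hbar/4\pi m)k^2$. The short Python sketch below is an illustration only;
the values of $\hbar$, $m$ and $k$ are arbitrary assumptions, and the derivative
$d\nu/dk$ is approximated by a central difference rather than taken analytically.

```python
import math

HBAR = 1.0   # assumption: arbitrary units
M = 2.0      # assumption: arbitrary particle mass


def nu(k):
    """Frequency of the component wave with wave number k: (hbar / 4 pi m) k^2."""
    return HBAR * k ** 2 / (4.0 * math.pi * M)


def phase_velocity(k):
    """nu * lambda, with lambda = 2 pi / k."""
    return nu(k) * (2.0 * math.pi / k)


def group_velocity(k, dk=1.0e-6):
    """2 pi (d nu / d k), with the derivative taken by central difference."""
    return 2.0 * math.pi * (nu(k + dk) - nu(k - dk)) / (2.0 * dk)


k = 3.0
v_classical = HBAR * k / M   # momentum k*hbar divided by the mass
print("classical speed:", v_classical)
print("phase velocity :", phase_velocity(k))
print("group velocity :", group_velocity(k))
```

The phase velocity comes out as half the classical speed and the group
velocity as the classical speed itself, whatever values are chosen for the
three constants.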

For a discussion of group velocity, see Sommerfeld, A., "Wellenmechanischer
Ergänzungsband," Friedr. Vieweg & Sohn, Braunschweig, 1929, p. 46.

11.27. Equation of Continuity, Current. — If the state function changes
in time in accordance with the Schrödinger equation

$$Hu = i\hbar\,\dot{u} \tag{11-118}$$

will it remain normalized? If it does not, there occurs a destruction or
creation of probability; while initially there was certainty of finding the
particle somewhere in space, there might later be uncertainty, a situation
which would clearly be physically untenable. Permanence of normalization,
however, follows immediately from (118). For

$$\frac{d}{dt}\int u^{*}u\,d\tau = \int [\dot{u}^{*}u + u^{*}\dot{u}]\,d\tau = \frac{i}{\hbar}\int [uH^{*}u^{*} - u^{*}Hu]\,d\tau$$

because of (118), and the last expression is zero on account of the Hermitian
character of $H$.

Having shown that $u^{*}u$ is conserved we can define a probability current
by subjecting $u^{*}u$, which we will call $\rho$ for the moment, to the equation
of continuity

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{I} = 0 \tag{11-119}$$

Whatever $\mathbf{I}$ turns out to be must be regarded as the current
corresponding to the flow of the quantity $u^{*}u$. We shall limit our
consideration to the case of a single particle so that

$$H = -\frac{\hbar^2}{2m}\nabla^2 + V$$

although generalization to many-dimensioned configuration space is easy.
Again because of (118)

$$\frac{\partial\rho}{\partial t} = \dot{u}^{*}u + u^{*}\dot{u} = \frac{i}{\hbar}\,(uH^{*}u^{*} - u^{*}Hu)
  = \frac{i\hbar}{2m}\,(u^{*}\nabla^2u - u\nabla^2u^{*})
  = \nabla\cdot\left[\frac{i\hbar}{2m}\,(u^{*}\nabla u - u\nabla u^{*})\right]$$

To satisfy (119) we must put

$$\mathbf{I} = -\frac{i\hbar}{2m}\,(u^{*}\nabla u - u\nabla u^{*}) \tag{11-120}$$

This form of $\mathbf{I}$ is correct so long as the potential energy $V$ is of the
scalar form here used. When $H$ contains a vector potential, $\mathbf{A}$, the
term $(e/c)\mathbf{A}$ must be added to the expression for the current here given.

It is interesting to observe that a state $u$ which has no complex dependence
on a space variable has no current associated with it. Thus, in the
free particle problem, $\cos kx$ and $\sin kx$ represent stationary states, but
$e^{ikx}$ and $e^{-ikx}$ have currents.
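
The last remark can be verified numerically from the one-dimensional form of
(120). The Python sketch below is an illustration only: it assumes units with
$\hbar = m = 1$, an arbitrary wave number, and a central-difference
approximation to $\partial u/\partial x$; the function names are hypothetical.

```python
import cmath
import math

HBAR = 1.0   # assumption: natural units
M = 1.0      # assumption: unit mass


def current(u, x, h=1.0e-5):
    """One-dimensional form of (120): I = -(i hbar / 2m)(u* du/dx - u du*/dx)."""
    du = (u(x + h) - u(x - h)) / (2.0 * h)      # central difference for du/dx
    val = u(x)
    expr = val.conjugate() * du - val * du.conjugate()
    return (-1j * HBAR / (2.0 * M) * expr).real  # the expression is purely real


K = 2.0  # assumed wave number


def running_wave(x):
    return cmath.exp(1j * K * x)


def reversed_wave(x):
    return cmath.exp(-1j * K * x)


def standing_wave(x):
    return complex(math.cos(K * x))


for name, fn in (("exp(+ikx)", running_wave),
                 ("exp(-ikx)", reversed_wave),
                 ("cos kx", standing_wave)):
    print(name, current(fn, 0.3))
```

The two running waves give currents $+\hbar k/m$ and $-\hbar k/m$, while the
real function $\cos kx$ gives none.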


Problem. Compute I for the various regions of the barrier problems considered 
in sec. 11.10. 


11.28. Application of Schrödinger's Time Equation. Simple Radiation
Theory. — The cases in which eq. (118) can be solved exactly (except
for the trivial one in which $H$ does not depend on $t$) are not numerous and
not very interesting. When the time equation (118) is not separable,
resort must be taken to approximation methods, the most useful of which
will now be illustrated.

Let an atom, whose normal Hamiltonian function, free from all
perturbations, is $H_0$, be suddenly subjected to a light wave which adds a
perturbing energy

$$V(x,t) = -eF_0\,x \sin\omega t \tag{11-121}$$

to $H$. Physically, this means the light wave is monochromatic and has
frequency $\nu = \omega/2\pi$; its electric vector is along $X$ and of amplitude
$F_0$. If $V$ did not contain $x$ and $\sin\omega t$ in product form, eq. (118)
with $H = H_0 + V$ would be separable; the fusion of $x$ and $t$ into $V$ spoils
separability.

Eq. (121) is a valid approximation for the purpose at hand. It neglects the
energy due to the magnetic vector of the light wave whose contribution is small
compared to (121) in the ratio $v/c$, where $v$ is the velocity of the charge
composing the atom and $c$ the velocity of light. For hydrogen, $v/c$ is 1/137.
Furthermore, eq. (121) implies that the wave length of the light is large
compared with the size of the atom. Correctly,
$V = -eF_0\,x \sin\left(\omega t - \dfrac{2\pi z}{\lambda}\right)$; we are omitting
the term $z/\lambda$. The legitimacy of this will be clear from the following analysis.

A more rigorous account of the general theory of radiation can be found in Fermi, E.,
Rev. Mod. Phys. 4, 125 (1932).

In solving (118) we use the following initial condition: At $t = 0$, when
the atom was exposed to the perturbation $V$, the atom was certainly in an
eigenstate of the operator $H_0$, say in the state $\psi_1$ corresponding to the
energy $E_1$ which we shall take to be the lowest energy of the system. Or,
if we wish to include the trivial time dependence of the state, we take

$$u = \psi_1\, e^{-(i/\hbar)E_1 t} \tag{11-122}$$

The solution of

$$(H_0 + V)v = i\hbar\,\dot{v} \tag{11-123}$$

which we desire, is certainly available in the form

$$v = \sum_\lambda c_\lambda \psi_\lambda\, e^{-(i/\hbar)E_\lambda t} \tag{11-124}$$

provided we let the coefficients $c$ be functions of the time. This follows
immediately from the completeness of the $\psi_\lambda$ with respect to functions
of the space coordinates. When (124) is substituted into (123), there results

$$\sum_\lambda c_\lambda (H_0\psi_\lambda + V\psi_\lambda)\, e^{-(i/\hbar)E_\lambda t}
  = \sum_\lambda (i\hbar\,\dot{c}_\lambda + E_\lambda c_\lambda)\,\psi_\lambda\, e^{-(i/\hbar)E_\lambda t}$$

wherein each term $H_0\psi_\lambda$ on the left cancels $E_\lambda\psi_\lambda$ on the
right. Let us now multiply the remaining terms of the equation by $\psi_k^{*}$ and
integrate over configuration space, remembering the orthogonality of the
$\psi_\lambda$. Then, after simple rearrangement,

$$\dot{c}_k = -\frac{i}{\hbar}\sum_\lambda c_\lambda V_{k\lambda}\, e^{(i/\hbar)(E_k - E_\lambda)t}, \qquad k = 1, 2, 3, \cdots \tag{11-125}$$

where, as usual,

$$V_{k\lambda} = \int \psi_k^{*}\, V\, \psi_\lambda\, d\tau$$

If the unperturbed atom has an infinite number of states, (125) represents
an infinite set of linear differential equations, which in general cannot
be solved. But we now recall that at $t = 0$, $v = u$; which means that
all $c_k$ except $c_1$ were zero at that time. Thereafter $c_1$ decayed from 1 to
some smaller value, while all other $c$'s grew from 0 to various finite values.
We now limit our inquiry to times so small that $c_1$ is still sensibly unity, and
the other $c$'s are small compared with it, although $\dot{c}_1$ may be quite
comparable with the time derivatives of other $c$'s. This permits the
approximation of replacing every $c_\lambda$ on the right-hand side of (125) by its
value at $t = 0$, while retaining every $\dot{c}_k$. The equation then becomes

$$\dot{c}_k = -\frac{i}{\hbar}\,V_{k1}\, e^{(i/\hbar)(E_k - E_1)t}$$

To simplify writing we introduce the abbreviation

$$\omega_k = \frac{E_k - E_1}{\hbar}$$

and observe that every $\omega_k > 0$, since, as we are assuming, $E_1$ is the lowest
energy state. In view of (121),

$$\dot{c}_k = \frac{eF_0}{2\hbar}\,x_{k1}\left[e^{i(\omega_k + \omega)t} - e^{i(\omega_k - \omega)t}\right]$$

On integration,

$$c_k = \frac{ieF_0}{2\hbar}\,x_{k1}\left[\frac{e^{i(\omega_k - \omega)t} - 1}{\omega_k - \omega} - \frac{e^{i(\omega_k + \omega)t} - 1}{\omega_k + \omega}\right], \qquad k \neq 1$$

where we have at once adjusted the constant of integration so that $c_k = 0$
when $t = 0$. For physical reasons, only the first term in the square
bracket need be retained because it alone can attain appreciable magnitude.
(Both $\omega$ and $\omega_k > 0$.) In fact $c_k$ is large only when
$\omega \approx \omega_k$, and this fact is accentuated when $c_k$ is squared:

$$|c_k|^2 = \frac{e^2F_0^2}{2\hbar^2}\,|x_{k1}|^2\,\frac{1 - \cos(\omega_k - \omega)t}{(\omega_k - \omega)^2} \tag{11-126}$$


We now interpret this result. The coefficient $c_k$ is, in view of (124),
the $k$-th probability amplitude in the expansion of the state function $v$ at
time $t$ in terms of energy eigenstates of the normal atom. Hence because
of sec. 5, $|c_k|^2$ is the probability that at time $t$ the $k$-th energy level of
the atom be excited; it is the "transition probability" from state 1 to state $k$
when the atom has been exposed to monochromatic light of frequency
$\omega/2\pi$ for $t$ seconds.

Many interesting conclusions of a physical nature can be drawn from
eq. (126), of which only two will here be mentioned. First, the transition
probability is proportional to the square of the matrix element connecting
the states in question. Whenever $x_{k1}$ vanishes, $|c_k|^2 = 0$. Hence the
vanishing of $x_{k1}$ is the criterion of a "forbidden transition." In the second
place, the transition probability is small unless $\omega \approx \omega_k$, which
is the Bohr frequency condition.
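
Both conclusions can be seen by evaluating (126) for a few frequencies. The
Python sketch below is an illustration only; the default values of $eF_0$ and
$\hbar$, the sample frequencies, and the function name are assumptions, and the
exactly resonant case is handled by the limit $(1 - \cos\Delta t)/\Delta^2 \to t^2/2$.

```python
import math


def transition_probability(t, omega, omega_k, x_k1, e_f0=1.0, hbar=1.0):
    """|c_k|^2 of eq. (126), which keeps only the resonant term.

    e_f0 stands for the product e*F0; its value and hbar = 1 are assumptions.
    """
    detuning = omega_k - omega
    prefactor = (e_f0 * abs(x_k1)) ** 2 / (2.0 * hbar ** 2)
    if abs(detuning * t) < 1.0e-6:
        # Near resonance use the limit (1 - cos(d t)) / d^2 -> t^2 / 2.
        return prefactor * t ** 2 / 2.0
    return prefactor * (1.0 - math.cos(detuning * t)) / detuning ** 2


t = 10.0
omega_k = 5.0
for omega in (3.0, 4.5, 5.0, 5.5, 7.0):
    print(omega, transition_probability(t, omega, omega_k, x_k1=1.0))

# A vanishing matrix element gives a forbidden transition at any frequency.
print(transition_probability(t, 5.0, omega_k, x_k1=0.0))
```

At exact resonance the expression grows as $t^2$, whereas away from resonance
it merely oscillates with an amplitude that falls off as the inverse square of
$\omega_k - \omega$.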

Problem. The reader may be surprised to find that $|c_k|^2$ is not a linear function
of $t$, as might be expected on physical grounds. Show that, when the incident light
forms a continuous spectrum of uniform intensity, $|c_k|^2$ is proportional to $t$.
(For this purpose, (126) must be integrated over $\omega$ from 0 to $\infty$; but the
integration may without appreciable error be taken from $-\infty$ to $+\infty$.)


ELECTRON SPIN. PAULI THEORY 

11.29. Fundamentals of the Theory. — The theory so far developed
describes the general behavior of atomic and molecular systems surprisingly
well, but it makes some false predictions, particularly with regard to the
finer details of the energy states of atoms, the Zeeman effect, and the
magnetic properties of electrons. It was soon apparent that the state of a single
electron could not be represented as a function of three space coordinates
alone, but that another parameter was required whose interpretation was
for some time in doubt. Most decisive in clarifying the situation was the
spectroscopic observation of the doubling of the energy levels of a single
electron: In all alkali atoms, for instance, two levels are found where the
Schrödinger equation permits only one. The energy difference between
these levels was such as would be produced by a small magnet of magnetic
moment $e\hbar/2mc$ setting itself once parallel and then opposite to the
magnetic field present in the atom on account of the electron's revolution.
Also, the angular momentum corresponding to these two energy
states was known to be different; it was equal to that caused by the
electron's orbital motion, plus $\hbar/2$ in one, minus $\hbar/2$ in the other state.

Uhlenbeck and Goudsmit suggested that the electron behaves like a
spinning top having a "spin" angular momentum of magnitude $\hbar/2$
which, however, can only add or subtract its whole amount, in quantum
fashion, to any angular momentum the electron already possesses as a
result of its orbital motion. Correspondingly, the electron generates by its
spin a magnetic moment of magnitude $e\hbar/2mc$ ($m$ is the electron mass, $c$ the
velocity of light), and this also communicates itself in toto, either parallel
or in opposition, to any magnetic moment already present.

To describe the electron spin as an angular momentum of the usual kind 
and to associate with it an operator like L (eq. 44) proved a fruitless under- 
taking, chiefly because L would have more than two eigenstates. The most 
successful procedure of including the spin in the quantum mechanical for- 
malism, aside from Dirac’s relativistic treatment of the electron, is that of 
Pauli which will now be described. What follows will refer only to the 
spin states of a single electron ; some applications to several electrons may 
be found in secs. 34 and 35. 

Since the three space coordinates are insufficient to specify the complete
state of an electron, we introduce a fourth, the "spin coordinate," and
denote it by $s_z$. It corresponds, in classical language, to the cosine of the
angle between the axis of the spin angular momentum and the Z-axis of
coordinates. This visual interpretation, while in no way dictated by the
mathematical formalism, will be found a useful mental aid. Thus the
state function of an electron has the form

$$\phi = \phi(x, y, z, s_z)$$

Since in all that follows, the hypothetical spin coordinates $s_x$ and $s_y$ are
never needed, we shall henceforth delete the subscript $z$ on $s$, but retain the
above interpretation. Hence $\phi = \phi(x, y, z, s)$. Finally, it is well for the
moment to abstract attention entirely from the space dependent part of
the wave function, i.e., to consider $x$, $y$, $z$ as fixed, concentrating our inquiry
solely upon the electron spin. Then $\phi = \phi(s)$.

If $s$, like $x$, $y$ and $z$, were permitted to assume a continuous range of
values, difficulties would result. Pauli therefore postulates, in a manner
admittedly ad hoc and designed to force success of the theory, that the
range of $s$ consists of only two points: $s = \pm 1$ (classical meaning: spin
vector is parallel or in opposition to Z). A function of $s$ is therefore
defined only at these two points. The most general spin function is,
accordingly,

$$\phi(s) = a\,\delta_{s,+1} + b\,\delta_{s,-1} \tag{11-127}$$

where the $\delta$'s are Kronecker symbols.

Our postulates involved certain integrals over configuration space.
But an integral over configuration space consisting of two points vanishes.
It becomes necessary to redefine the integral as a summation over the two
points:

$$\int F(s)\,ds \equiv F(-1) + F(1)$$

If $\phi(s)$ is to be normalized,

$$\int \phi^{*}\phi\,ds = \sum_s \left(|a|^2\delta_{s,+1} + |b|^2\delta_{s,-1} + (a^{*}b + b^{*}a)\,\delta_{s,+1}\delta_{s,-1}\right)
  = |a|^2 + |b|^2 = 1 \tag{11-128}$$

In a very trivial sense, eq. (127) represents an expansion of a function
$\phi(s)$ in a complete orthonormal set of functions, $\delta_{s,+1}$ and
$\delta_{s,-1}$. To what operator do these two functions belong as eigenstates?
The answer is suggested by intuition and will be justified by its complete
success; it is the operator $S_z$ which is associated with the observable: spin
angular momentum along Z. We must now give thought to the mathematical
structure of this operator.

Empirical evidence cited in the introductory paragraphs demands that
its two eigenvalues be $\pm\hbar/2$. Hence it must satisfy the two equations

$$S_z\,\delta_{s,+1} = \frac{\hbar}{2}\,\delta_{s,+1}, \qquad S_z\,\delta_{s,-1} = -\frac{\hbar}{2}\,\delta_{s,-1} \tag{11-129}$$

It is possible to show that no differential operator of the type encountered
previously can satisfy these equations without giving rise to an infinite
number of other eigenstates. But why search for the operator? The simplest
point of view, and that here taken, is to regard eqs. (129) as a definition of
the operator $S_z$. The result of applying $S_z$ to the most general function
of $s$ (eq. 127) can be constructed on the basis of (129), hence (129) exhausts
the meaning of $S_z$ and is its definition.

To simplify the notation, and to be in accord with custom, we now
introduce the symbol $\alpha(s)$ for $\delta_{s,+1}$ and $\beta(s)$ for $\delta_{s,-1}$.
Furthermore, we define a new operator

$$\sigma_z = \frac{2}{\hbar}\,S_z$$

which has eigenvalues $\pm 1$, simply to save writing. Then,
in view of (129),

$$\sigma_z\alpha(s) = \alpha(s), \qquad \sigma_z\beta(s) = -\beta(s) \tag{11-130}$$


An operator $P$ is in general uniquely determined when the result of its action
upon each member of an orthonormal set of functions is known. This method of
defining an operator is ordinarily not useful because an infinite number of
relations like (129) would be required.

It is indeed possible and often useful to find an explicit operator in form
of a matrix which will satisfy these equations. This matrix is easily formed
by means of the principles outlined in sec. 17. Our eigenstates are $\psi_1 = \alpha$,
$\psi_2 = \beta$ and we construct $(\sigma_z)_{ij} = \int \psi_i^{*}\sigma_z\psi_j\,d\tau$ with
the integral replaced by a summation. We thus obtain the two-square matrix

$$\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{11-131}$$

To let it operate on what was formerly the function $\phi(s)$ the latter has
to be regarded as a vector whose components are its expansion coefficients:
If the function $\phi$ is given by

$$\phi(s) = a\alpha + b\beta$$

$a$ and $b$ being numbers, then the vector $\phi(s)$ is

$$\phi = \begin{pmatrix} a \\ b \end{pmatrix}$$

Thus, in the matrix representation,

$$\sigma_z\phi = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} a \\ -b \end{pmatrix} \tag{11-132}$$

and the reader will easily verify by the rules of Chapter 10 that the two
eigenvectors of $\sigma_z$ are $\phi = \begin{pmatrix} a \\ 0 \end{pmatrix}$ and
$\phi = \begin{pmatrix} 0 \\ b \end{pmatrix}$, where the values of both $a$
and $b$ must be unity because of (128). The eigenvalues are, respectively,
$+1$ and $-1$. But the functions $\phi$ corresponding to the vectors
$\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$
are clearly $\alpha$ and $\beta$, which takes us back to the scheme (130).

It is seen that there is a complete isomorphism between the two descriptions
of the operator $S_z$ and its eigenstates $\phi$: One in terms of matrices
and eigenvectors, where the rule of operations is (132); the other in terms
of linear substitution operators and eigenfunctions, where the rule of
operations is (130).

The question now arises as to the structure of the operators $S_x$ and $S_y$,
associated with the other two components of the spin. In endeavoring to
construct them it is important to recall one significant fact concerning the
ordinary angular momentum $\mathbf{L}$: its components do not commute with one
another. In fact (see eq. 7)

$$L_xL_y - L_yL_x = i\hbar L_z, \qquad L_yL_z - L_zL_y = i\hbar L_x, \qquad L_zL_x - L_xL_z = i\hbar L_y$$

Let us assume that the components of the spin $\mathbf{S}$, this being an angular
momentum operator, must be subject to the same commutation rules. In
terms of $\sigma$ rather than $\mathbf{S}$, we postulate

$$\sigma_x\sigma_y - \sigma_y\sigma_x = 2i\sigma_z, \qquad \sigma_y\sigma_z - \sigma_z\sigma_y = 2i\sigma_x, \qquad \sigma_z\sigma_x - \sigma_x\sigma_z = 2i\sigma_y \tag{11-133}$$

These relations imply that an eigenstate of $S_z$, e.g., $\alpha(s)$ or $\beta(s)$,
cannot be a simultaneous eigenstate of $S_x$ or $S_y$ (sec. 7).

While we need only one spin coordinate, all three components of the operator
must be introduced because they appear in the Hamiltonian and other operators.

The construction of $\sigma_x$ and $\sigma_y$, $\sigma_z$ being given, is more easily
performed in the matrix scheme. If we set ourselves the problem of determining two
matrices $\sigma_x$ and $\sigma_y$, which, when combined with $\sigma_z$ of eq. (131),
obey (133), we easily find that the answer is not unique. But certainly the solution

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{11-134}$$

is a possible one. The ambiguity here encountered permits just enough
freedom to make possible a rotation of coordinate axes (see Chap. 15).

Let us, then, accept (134) as our solution in matrix form. Clearly, $\sigma_x$
has eigenvalues $\pm 1$, eigenvectors
$\dfrac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and
$\dfrac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$; $\sigma_y$ has
eigenvalues $\pm 1$, eigenvectors
$\dfrac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix}$ and
$\dfrac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -i \end{pmatrix}$. The observable values
of all three components $S_x$, $S_y$ and $S_z$ are therefore $\pm\hbar/2$. When these
results are translated into the function language they read as follows.

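The equation $\sigma_x\phi(s) = \lambda\phi(s)$ has two possible (normalized) solutions:

$$\lambda = 1,\ \ \phi(s) = \sqrt{\tfrac{1}{2}}\,[\alpha(s) + \beta(s)]; \qquad
  \lambda = -1,\ \ \phi(s) = \sqrt{\tfrac{1}{2}}\,[\alpha(s) - \beta(s)] \tag{11-135a}$$

The equation $\sigma_y\phi(s) = \lambda\phi(s)$ has two possible solutions:

$$\lambda = 1,\ \ \phi(s) = \sqrt{\tfrac{1}{2}}\,[\alpha(s) + i\beta(s)]; \qquad
  \lambda = -1,\ \ \phi(s) = \sqrt{\tfrac{1}{2}}\,[\alpha(s) - i\beta(s)] \tag{11-135b}$$

The equation $\sigma_z\phi(s) = \lambda\phi(s)$ has two possible solutions:

$$\lambda = 1,\ \ \phi(s) = \alpha(s); \qquad \lambda = -1,\ \ \phi(s) = \beta(s) \tag{11-135c}$$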


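If now we write the eqs. (135a) in the simpler form

$$\sigma_x\alpha + \sigma_x\beta = \alpha + \beta, \qquad \sigma_x\alpha - \sigma_x\beta = -(\alpha - \beta)$$

and solve these by adding and subtracting, we find

$$\sigma_x\alpha = \beta, \qquad \sigma_x\beta = \alpha$$

The same procedure applied to (135b) and (135c) yields similar relations.
Summarizing these results: The operators $\sigma_x$, $\sigma_y$, $\sigma_z$ may be
represented either by the set of linear substitutions

$$\begin{aligned}
\sigma_x\alpha &= \beta, & \sigma_y\alpha &= i\beta, & \sigma_z\alpha &= \alpha, \\
\sigma_x\beta &= \alpha; & \sigma_y\beta &= -i\alpha; & \sigma_z\beta &= -\beta
\end{aligned} \tag{11-136}$$

or by the matrices

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
  \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
  \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{11-137}$$

For practical use, the set of substitutions is to be preferred.
Note that the operators

$$\sigma^{+} \equiv \tfrac{1}{2}(\sigma_x + i\sigma_y) \quad\text{and}\quad \sigma^{-} \equiv \tfrac{1}{2}(\sigma_x - i\sigma_y)$$

satisfy the convenient relations

$$\sigma^{+}\alpha = 0, \quad \sigma^{-}\alpha = \beta; \qquad \sigma^{+}\beta = \alpha, \quad \sigma^{-}\beta = 0$$

They are sometimes called "displacement operators."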
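
The commutation rules (133) and the displacement relations can be confirmed by
direct matrix multiplication. The Python sketch below is an illustration only;
it represents each matrix as a nested list, and the helper names (`matmul`,
`commutator`, and so on) are hypothetical conveniences, not notation from the
text.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def commutator(a, b):
    """ab - ba."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def apply(a, v):
    """Matrix acting on a two-component vector (coefficients of alpha and beta)."""
    return [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]


SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]
ALPHA = [1, 0]
BETA = [0, 1]

SIGMA_PLUS = scale(0.5, add(SIGMA_X, scale(1j, SIGMA_Y)))
SIGMA_MINUS = scale(0.5, add(SIGMA_X, scale(-1j, SIGMA_Y)))

print(commutator(SIGMA_X, SIGMA_Y) == scale(2j, SIGMA_Z))
print(commutator(SIGMA_Y, SIGMA_Z) == scale(2j, SIGMA_X))
print(commutator(SIGMA_Z, SIGMA_X) == scale(2j, SIGMA_Y))
print(apply(SIGMA_PLUS, BETA), apply(SIGMA_MINUS, ALPHA))
```

Because all entries are small integers or multiples of $i$, the comparisons are
exact and no numerical tolerance is needed.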

We return to the consideration of the general state function of an
electron, which includes $x$, $y$, $z$ and $s$ as arguments. Such a function may
certainly be expanded in eigenfunctions of $\sigma_z$, i.e.,

$$\phi(x,y,z,s) = \phi_{+}(x,y,z)\,\alpha(s) + \phi_{-}(x,y,z)\,\beta(s)$$

Normalization now requires

$$\sum_s \int \phi^{*}\phi\,dx\,dy\,dz = \int (\phi_{+}^{*}\phi_{+} + \phi_{-}^{*}\phi_{-})\,dx\,dy\,dz = 1$$

The operators $\sigma_x$, $\sigma_y$, $\sigma_z$ do not act on $\phi_{+}$ and $\phi_{-}$,
which are only functions of $x$, $y$, $z$; in other words, they commute with space
coordinates. Thus, for instance,

$$\sigma_y\phi = \sigma_y\phi_{+}\alpha + \sigma_y\phi_{-}\beta = \phi_{+}\sigma_y\alpha + \phi_{-}\sigma_y\beta = i\phi_{+}\beta - i\phi_{-}\alpha$$

In the matrix scheme, $\phi(x,y,z,s)$ is represented by the vector

$$\phi = \begin{pmatrix} \phi_{+}(x,y,z) \\ \phi_{-}(x,y,z) \end{pmatrix}$$

In the sense of this analysis it may be said that the introduction of the spin
in the Pauli manner causes all Schrödinger functions to become two-component
functions.


Problem. Carry out the algebra involved in finding the two Hermitian matrices 
(134). 


11.30. Applications. — a. Atom in a Magnetic Field. Our interest here
is not in a complete solution of this problem, which may be found worked
out in most books on quantum mechanics, but in its salient mathematical
features. We wish to find the energies of a one-electron atom (e.g., hydrogen
or, with good approximation, the alkalis) when it is placed in a uniform
magnetic field. The Hamiltonian consists of two parts, one acting on the
electron's space coordinates, which we shall call $H_0$, and one acting on
the spin coordinate. If the magnetic field $\boldsymbol{\mathcal{H}}$ is taken along
the Z-axis, the classical energy of a particle of magnetic moment
$\boldsymbol{\mu}$ would be $\boldsymbol{\mu}\cdot\boldsymbol{\mathcal{H}} = \mu_z\mathcal{H}_z$.
But empirically, the magnetic moment associated with the spin is
$(e\hbar/2mc)\boldsymbol{\sigma}$. We shall here write $\mu$ for the constant
$e\hbar/2mc$. In quantum mechanical transcription, then, the "spin energy" is
$\mu\mathcal{H}_z\sigma_z$ where $\sigma_z$ is interpreted as the operator (130) or
(131). The Schrödinger equation becomes

$$(H_0 + \mu\mathcal{H}_z\sigma_z)\Psi = E\Psi \tag{11-138}$$

Let

$$\Psi(x,y,z,s) = \psi_{+}(x,y,z)\,\alpha(s) + \psi_{-}(x,y,z)\,\beta(s)$$

and substitute, obtaining

$$\alpha(s)\,[H_0 + \mu\mathcal{H}_z - E]\,\psi_{+} + \beta(s)\,[H_0 - \mu\mathcal{H}_z - E]\,\psi_{-} = 0$$

provided relations (136) are used. Since $\alpha$ and $\beta$ are linearly independent,
orthogonal functions of $s$, their coefficients in the last equation must
separately vanish. Hence we have

$$H_0\psi_{+} = (E - \mu\mathcal{H}_z)\,\psi_{+}, \qquad H_0\psi_{-} = (E + \mu\mathcal{H}_z)\,\psi_{-} \tag{11-139}$$

Now let $E_0$ be an eigenvalue of $H_0$, $\psi_0$ the corresponding eigenfunction.
The first of eqs. (139) then says $E - \mu\mathcal{H}_z = E_0$, or
$E = E_0 + \mu\mathcal{H}_z$, $\psi_{+} = \psi_0$. On substituting this value of $E$
into the second equation it reads $H_0\psi_{-} = (E_0 + 2\mu\mathcal{H}_z)\psi_{-}$,
and this can only be satisfied by putting $\psi_{-} = 0$ because
$E_0 + 2\mu\mathcal{H}_z$ is not an eigenvalue of $H_0$. Thus we obtain as one
solution of (138)

$$E = E_0 + \mu\mathcal{H}_z, \qquad \Psi = \psi_0(x,y,z)\,\alpha(s) \tag{11-140a}$$

But we can also start with the second of eqs. (139) and assume $\psi_{-}$ to be
$\psi_0$, $E + \mu\mathcal{H}_z$ to be $E_0$. Then $\psi_{+} = 0$ and we have

$$E = E_0 - \mu\mathcal{H}_z, \qquad \Psi = \psi_0(x,y,z)\,\beta(s) \tag{11-140b}$$

How does the inclusion of the spin modify the eigenvalues and eigenfunctions
of the Schrödinger equation when there is no magnetic field?
The answer is obtained by letting $\mathcal{H}_z$ vanish in (140a, b). Both values
of $E$ coalesce to $E_0$ which now represents the ordinary Schrödinger energy in
the absence of a field, but the functions $\Psi$ remain distinct. The spin thus
introduces a degeneracy into the Schrödinger representation of states.
Formulas (140) account, in a primitive way, for the doubling of the
alkali energy levels, the field $\mathcal{H}_z$ being caused in that case by the
electron's orbital motion, and not by external agencies.

Problem. Solve eq. (138) by the method of separation of variables, i.e., by putting
$\Psi = \psi(x,y,z)\,\phi(s)$, and show that (140) is the solution obtained by that method also.

That the coefficients of $\alpha$ and $\beta$ in the equation preceding (139) must
vanish separately can be seen explicitly if the equation is multiplied by either
$\alpha(s)$ or $\beta(s)$ and then "integrated" over $s$.

b. A Spin Problem. Having shown how spin and coordinate functions
cooperate in the description of the state of an electron, let us omit further
reference to space coordinates and inquire what are the energies which an
electron, placed in a uniform magnetic field of arbitrary direction, may
assume regardless of its translational motion. The only energy of interest
is that due to the spin. Let $\boldsymbol{\mathcal{H}}$ be the magnetic field
strength. The Schrödinger equation reads

$$\mu\,\boldsymbol{\mathcal{H}}\cdot\boldsymbol{\sigma}\,\psi = \mu(\mathcal{H}_x\sigma_x + \mathcal{H}_y\sigma_y + \mathcal{H}_z\sigma_z)\,\psi(s) = E\psi(s) \tag{11-141}$$

If $\boldsymbol{\mathcal{H}}$ is taken along Z, the equation reduces to

$$\mu\mathcal{H}\sigma_z\psi(s) = E\psi(s) \tag{11-142}$$

The operator on the left is but a constant multiple of $\sigma_z$ and must therefore
have the same eigenfunctions as $\sigma_z$, i.e., $\alpha$ and $\beta$. The corresponding
eigenvalues are at once seen to be $E = \pm\mu\mathcal{H}$. We shall show that eq. (141)
has the same eigenvalues, but different eigenfunctions.

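Make the substitution $\psi = a\alpha(s) + b\beta(s)$ in eq. (141). On using,
subsequently, relations (136) the result will be

$$\mu\{\mathcal{H}_x(a\beta + b\alpha) + i\mathcal{H}_y(a\beta - b\alpha) + \mathcal{H}_z(a\alpha - b\beta)\} - E(a\alpha + b\beta) = 0$$

As before, the coefficients of $\alpha$ and $\beta$ may be put equal to zero separately,
so that

$$\begin{aligned}
\mu(\mathcal{H}_x a + i\mathcal{H}_y a - \mathcal{H}_z b) &= Eb \\
\mu(\mathcal{H}_x b - i\mathcal{H}_y b + \mathcal{H}_z a) &= Ea
\end{aligned} \tag{11-143}$$

If the equations are to have solutions $a$, $b$, which are different from zero, the
determinant of the coefficients of $a$, $b$ must vanish:

$$\begin{vmatrix} \mu(\mathcal{H}_x + i\mathcal{H}_y) & -\mu\mathcal{H}_z - E \\ \mu\mathcal{H}_z - E & \mu(\mathcal{H}_x - i\mathcal{H}_y) \end{vmatrix} = 0$$

whence $E = \pm\mu\mathcal{H}$. On substituting $E = +\mu\mathcal{H}$ into the first of
eqs. (143) and then taking the square of its absolute value, we have

$$(\mathcal{H}_x^2 + \mathcal{H}_y^2)\,|a|^2 = (\mathcal{H} + \mathcal{H}_z)^2\,|b|^2$$

Let us call the angle between $\boldsymbol{\mathcal{H}}$ and the Z-axis $\theta$, so that
$\mathcal{H}_x^2 + \mathcal{H}_y^2 = \mathcal{H}^2\sin^2\theta$ and
$\mathcal{H}_z = \mathcal{H}\cos\theta$. Furthermore, in view of (128),
$|b|^2 = 1 - |a|^2$. When these substitutions are made and the last equation is
solved, there results

$$|a|^2 = \cos^2\frac{\theta}{2}, \qquad |b|^2 = \sin^2\frac{\theta}{2}$$

When $E = -\mu\mathcal{H}$ is introduced into (143), they lead to

$$|a|^2 = \sin^2\frac{\theta}{2}, \qquad |b|^2 = \cos^2\frac{\theta}{2}$$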


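We conclude that eq. (141) has the eigenvalues $E_1 = \mu\mathcal{H}$,
$E_2 = -\mu\mathcal{H}$, and the corresponding eigenfunctions

$$\begin{aligned}
\psi_1 &= \cos\frac{\theta}{2}\cdot\alpha(s) + \sin\frac{\theta}{2}\cdot\beta(s) \\
\psi_2 &= \sin\frac{\theta}{2}\cdot\alpha(s) - \cos\frac{\theta}{2}\cdot\beta(s)
\end{aligned} \tag{11-144}$$

The signs are so chosen as to make $|\psi_1| \to |\psi_2|$ and $|\psi_2| \to |\psi_1|$
when $\theta$ is changed by 180 degrees.

Problem. Solve eq. (141) by diagonalizing the matrix

$$\mu\begin{pmatrix} \mathcal{H}_z & \mathcal{H}_x - i\mathcal{H}_y \\ \mathcal{H}_x + i\mathcal{H}_y & -\mathcal{H}_z \end{pmatrix}$$

and show that it leads to the same results.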
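
The eigenvalue $+\mu\mathcal{H}$ and the weight $|a|^2 = \cos^2(\theta/2)$ can be
checked for any field direction. The Python sketch below is an illustration
only: it builds $\mu\,\boldsymbol{\mathcal{H}}\cdot\boldsymbol{\sigma}$ from the matrices
(137), uses the unnormalized vector $(\mathcal{H} + \mathcal{H}_z,\ \mathcal{H}_x + i\mathcal{H}_y)$
as a trial eigenvector (an assumption that fails only for a field pointing
exactly along $-Z$), and the function names and field components are hypothetical.

```python
import math


def apply_hamiltonian(hx, hy, hz, a, b, mu=1.0):
    """Components of mu (H . sigma) acting on a*alpha + b*beta, via matrices (137)."""
    upper = mu * (hz * a + complex(hx, -hy) * b)
    lower = mu * (complex(hx, hy) * a - hz * b)
    return upper, lower


def upper_state(hx, hy, hz, mu=1.0):
    """Energy +mu*H and normalized coefficients (a, b) of the upper spin state.

    Assumes the field does not point exactly along -Z (then h + hz = 0).
    """
    h = math.sqrt(hx * hx + hy * hy + hz * hz)
    a = h + hz                    # trial eigenvector, first component
    b = complex(hx, hy)           # trial eigenvector, second component
    norm = math.sqrt(a * a + abs(b) ** 2)
    return mu * h, a / norm, b / norm


# Illustrative field components (assumed values).
hx, hy, hz = 1.0, 2.0, 2.0
energy, a, b = upper_state(hx, hy, hz)
theta = math.acos(hz / math.sqrt(hx * hx + hy * hy + hz * hz))

print("energy:", energy)
print("|a|^2:", abs(a) ** 2, " cos^2(theta/2):", math.cos(theta / 2.0) ** 2)
print("|b|^2:", abs(b) ** 2, " sin^2(theta/2):", math.sin(theta / 2.0) ** 2)
print("H applied:", apply_hamiltonian(hx, hy, hz, a, b), " E*(a, b):", (energy * a, energy * b))
```

The coefficient $b$ generally carries a complex phase fixed by the azimuth of the
field; only the absolute squares enter the comparison with
$\cos^2(\theta/2)$ and $\sin^2(\theta/2)$.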


THE MANY-BODY PROBLEM AND THE EXCLUSION PRINCIPLE 

11.31. Separation of the Coordinates of the Center of Mass. — In
classical mechanics, a system containing many particles and subject only
to internal forces behaves in such a way that its center of mass moves
uniformly on a straight line. As a corollary of this theorem every classical
two-body problem may be reduced to a one-body problem. A similar
fact may be proved in quantum theory.

The Schrödinger equation for a system of $n$ particles of masses
$m_1, \cdots, m_n$ reads:

$$\left(-\sum_{i=1}^{n}\frac{\hbar^2}{2m_i}\nabla_i^2 + V\right)\psi = E\psi \tag{11-145}$$

where $\nabla_i^2 = \partial^2/\partial x_i^2 + \partial^2/\partial y_i^2 + \partial^2/\partial z_i^2$.
The potential energy, $V$, is to be regarded as a function of the relative
coordinates $x_j - x_i$, $y_j - y_i$, $z_j - z_i$.
We first transform to a new set of coordinates, defined as follows:


n n 

A == 7“ 2^ m^x,*, M 
M 1 1 

«2 = ^2 ~ A, X3 = .r3 — A, • • x' = Xn — A 


(11-146) 


with similar relations for the y and z components. Note that x[ is missing; 
the coordinates of one particle have been eliminated by the introduction of 
the center of mass coordinates A, F, Z. In computing the sum of the 
Laplacian operators occurring in (145) in terms of the new coordinates we 
observe : 


dX dY dZ Mi dXj _ dy'j _ ^ — 5- — 

dXi dyx dZi M ' dXt dyt dZi Af 


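So long as relativity effects are neglected.

Using these relations, simple differentiation yields

$$\frac{\partial^2}{\partial x_1^2} = \frac{m_1^2}{M^2}\left(\frac{\partial^2}{\partial X^2} - 2\sum_{i=2}^{n}\frac{\partial^2}{\partial X\,\partial x_i'} + \sum_{i,j=2}^{n}\frac{\partial^2}{\partial x_i'\,\partial x_j'}\right)$$

$$\frac{\partial^2}{\partial x_k^2} = \frac{m_k^2}{M^2}\left(\frac{\partial^2}{\partial X^2} - 2\sum_{i=2}^{n}\frac{\partial^2}{\partial X\,\partial x_i'} + \sum_{i,j=2}^{n}\frac{\partial^2}{\partial x_i'\,\partial x_j'}\right) + \frac{2m_k}{M}\left(\frac{\partial^2}{\partial X\,\partial x_k'} - \sum_{i=2}^{n}\frac{\partial^2}{\partial x_i'\,\partial x_k'}\right) + \frac{\partial^2}{\partial x_k'^2}, \qquad k \ge 2$$

and similar expressions for the derivatives with respect to y and z. When
these are combined we obtain, in place of (145), the equation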



$$\left[-\frac{\hbar^2}{2M}\nabla_c^2 - \frac{\hbar^2}{2}\sum_{i=2}^{n}\frac{1}{m_i}\nabla_i'^2 + \frac{\hbar^2}{2M}\sum_{i,j=2}^{n}\nabla_i'\cdot\nabla_j' + V\right]\psi = E\psi$$ (11-147)

Here $\nabla_c^2$ is the Laplacian with respect to the center of mass coordinates,
$\nabla_i'^2$ with respect to the primed coordinates. While V is not directly a
function of the primed coordinates, it may be expressed in terms of them
because $x_j - x_i = x_j' - x_i'$. A difficulty might seem to appear in connection
with $x_i - x_1$ because $x_1'$ is absent from the primed set. But it is easily
seen that $m_1(x_1 - X) = -\sum_{j=2}^{n} m_j x_j'$, whence
$x_i - x_1 = x_i' + \frac{1}{m_1}\sum_{j=2}^{n} m_j x_j'$. Therefore
V, when expressed in terms of the new coordinates, will not contain
X, Y, or Z.


As a result, eq. (147) is separable; therefore $\psi$ may be written as
$\chi(X,Y,Z)\cdot\phi(x_2', \cdots, z_n')$.

Correspondingly, $E = E_c + E'$, where $E_c$ is the energy associated with
$\chi(X,Y,Z)$, determined by

$$-\frac{\hbar^2}{2M}\nabla_c^2\chi = E_c\chi$$

This is the Schrödinger equation of a free particle of mass M; it produces,
as we know, no quantization. The remainder of (147) describes
the internal motion of the particles:

$$\left[-\frac{\hbar^2}{2}\left(\sum_{i=2}^{n}\frac{1}{m_i}\nabla_i'^2 - \frac{1}{M}\sum_{i,j=2}^{n}\nabla_i'\cdot\nabla_j'\right) + V\right]\phi = E'\phi$$ (11-148)

It differs from the normal form of Schrödinger's equation by the presence
of the terms in $\nabla_i'\cdot\nabla_j'$ and by the fact that V has a different functional form
in the primed coordinates than in the unprimed ones.

The coordinates (146) measure the position of the i-th particle relative 
to the center of mass. It is also possible to use a less symmetrical but 
physically more useful set of coordinates, which is closely related to (146). 
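If we put

$$X = \frac{1}{M}\sum_{i=1}^{n} m_i x_i, \qquad M = \sum_{i=1}^{n} m_i$$

$$x_2' = x_2 - x_1, \quad x_3' = x_3 - x_1, \quad \cdots, \quad x_n' = x_n - x_1$$ (11-149)

thus measuring all coordinates relative to that one which has been eliminated
($x_1$), we obtain in the same manner the equation

$$\left[-\frac{\hbar^2}{2M}\nabla_c^2 - \frac{\hbar^2}{2}\sum_{i=2}^{n}\frac{1}{m_i}\nabla_i'^2 - \frac{\hbar^2}{2m_1}\sum_{i,j=2}^{n}\nabla_i'\cdot\nabla_j' + V\right]\psi = E\psi$$ (11-150)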
This form is particularly useful when it is desired to calculate the energy of
a many-electron atom, for particle 1 may then be taken to be the nucleus and
the summations in (150) are extended only over the electrons. The equation
remaining after separation of the motion of the center of mass is now

$$\left[-\frac{\hbar^2}{2}\left(\frac{1}{m}\sum_{i}\nabla_i'^2 + \frac{1}{m_1}\sum_{i,j}\nabla_i'\cdot\nabla_j'\right) + V\right]\phi = E'\phi$$

where m is the mass of an electron, $m_1$ that of the nucleus. It may be
written in terms of the reduced mass

$$\mu = \frac{m m_1}{m + m_1}$$

as follows:

$$\left[-\frac{\hbar^2}{2\mu}\sum_{i}\nabla_i'^2 - \frac{\hbar^2}{2m_1}\sum_{i\neq j}\nabla_i'\cdot\nabla_j' + V\right]\phi = E'\phi$$ (11-151)

The terms in the double summation play an important role in the isotope
effect of heavy atoms. They are present whenever the number of electrons
is greater than one. For the case of hydrogen, eq. (151) has the same
form as Schrödinger's equation for a stationary nucleus, except for the
replacement of the electron mass by $\mu$. Hence the true energies of the
hydrogen atom are not exactly given by eq. (64), but by that equation with
$\mu$ written for m.

Note that the function V is different in (148) and (151), and that the
terms of the double summation have opposite signs. Nevertheless the
equivalence of these two equations for the two-body problem may be seen
as follows. Write for the potential energy in (151)

$$V = V(x',y',z'), \quad \text{where } x' = x_2 - x_1, \text{ etc.}$$

The V-function of (148) must then be expressed in terms of $x_2 - X$, $y_2 - Y$,
$z_2 - Z$. Now $x_2 - x_1 = \frac{m_1 + m_2}{m_1}(x_2 - X)$. Therefore we must use
in (148)

$$V\left(\frac{m_1 + m_2}{m_1}x', \; \frac{m_1 + m_2}{m_1}y', \; \frac{m_1 + m_2}{m_1}z'\right)$$

and the equation reads

$$\left[-\frac{\hbar^2}{2}\left(\frac{1}{m_2} - \frac{1}{m_1 + m_2}\right)\nabla'^2 + V(\alpha x',\alpha y',\alpha z')\right]\phi(x',y',z') = E'\phi(x',y',z')$$

where $\alpha = (m_1 + m_2)/m_1$. If here we put $\alpha x' = x''$, $\alpha y' = y''$, $\alpha z' = z''$,
it becomes

$$\left[-\frac{\hbar^2}{2\mu}\nabla''^2 + V(x'',y'',z'')\right]\phi = E'\phi$$

which is identical with eq. (151).

See Hughes, A. L., and Eckart, C., Phys. Rev. 36, 694 (1930).
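The scaling argument for the two-body case can be checked with a few lines of arithmetic. The sketch below is only an illustration: the mass ratio 1836.15 (proton to electron) is an assumed input, and the variable names are not from the text. It confirms that the kinetic coefficient of (148), multiplied by the square of $(m_1 + m_2)/m_1$, equals $1/\mu$, and it shows how little $\mu$ differs from the electron mass in hydrogen.

```python
# Reduced-mass bookkeeping for the two-body problem (illustrative values only).
m1 = 1836.15   # nucleus mass in electron masses (assumed proton/electron ratio)
m2 = 1.0       # electron mass

M = m1 + m2
mu = m1 * m2 / M            # reduced mass appearing in eq. (151)
alpha = M / m1              # scale factor (m1 + m2)/m1

# Kinetic coefficient of the two-body form of (148): 1/m2 - 1/(m1 + m2).
coeff_148 = 1.0 / m2 - 1.0 / M

# With x'' = alpha * x' the Laplacian picks up alpha**2; the result should be 1/mu.
coeff_after_scaling = alpha**2 * coeff_148

# Hydrogen energies scale with the mass, so mu/m is the correction factor to eq. (64).
energy_ratio = mu / m2

print(mu, coeff_after_scaling, 1.0 / mu, energy_ratio)
```

The correction factor differs from unity by about 5 parts in 10,000, which is why the stationary-nucleus energies are already a good approximation.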

11.32. Independent Systems. — Physical systems are independent, or 
isolated from one another, if the Hamiltonian operator of one contains no 
terms referring to another system. There is then no interaction between 
them. Consider n independent systems, and let the coordinates of the 
r-th system (including the spin coordinate) be symbolized by the single
letter $q_r$. If its Hamiltonian operator is $H_r$, its Schrödinger equation will be

$$H_r\psi_i^{(r)}(q_r) = E_i^{(r)}\psi_i^{(r)}(q_r)$$ (11-152)

$E_i^{(r)}$ being the i-th eigenvalue of the r-th system.

The state function describing the entire assemblage of n systems will
satisfy the equation

$$(H_1 + H_2 + \cdots + H_n)\Psi(q_1,q_2,\cdots q_n) = E\Psi(q_1,q_2,\cdots q_n)$$ (11-153)

To find its solutions we put $\Psi(q_1,q_2,\cdots q_n) = \psi^{(1)}(q_1)\psi^{(2)}(q_2)\cdots\psi^{(n)}(q_n)$
tentatively. Substitution in (153) and use of the fact that $H_1$ acts only on
$q_1$, etc., leads at once to the equation

$$\frac{H_1\psi^{(1)}}{\psi^{(1)}} + \frac{H_2\psi^{(2)}}{\psi^{(2)}} + \cdots + \frac{H_n\psi^{(n)}}{\psi^{(n)}} = E$$

which shows that each term $H_r\psi^{(r)}/\psi^{(r)}$ is separately a constant, say $E^{(r)}$,
and that the sum of all these constants is E. But if $H_r\psi^{(r)} = E^{(r)}\psi^{(r)}$,
then $\psi^{(r)}$ must be one of the set of functions defined by (152), and $E^{(r)}$
one of the energies $E_i^{(r)}$. Therefore

$$\Psi(q_1,q_2,\cdots q_n) = \psi_i^{(1)}(q_1)\cdot\psi_j^{(2)}(q_2)\cdots\psi_s^{(n)}(q_n)$$

$$E = E_i^{(1)} + E_j^{(2)} + \cdots + E_s^{(n)}$$ (11-154)


This result is indeed what intuition would lead us to expect. For
clearly the total energy of a number of isolated systems is the sum of the
individual energies. Furthermore, if $w_1$ is the probability that system 1 be
found at $q_1$, $w_2$ that system 2 be found at $q_2$, then the probability that both
of these statements be true simultaneously is the product $w_1w_2$. Hence
the individual $\psi$-functions, whose squares are these probabilities, must
likewise combine as factors.

This latter circumstance is dictated also by the time dependence of the
Schrödinger states, eq. (113). For only the product of the individual
functions $\psi_i^{(1)}(q_1)e^{-iE_i^{(1)}t/\hbar}$, etc., will have the factor $e^{-iEt/\hbar}$
required in $\Psi(q_1,q_2,\cdots q_n)e^{-iEt/\hbar}$.
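The product rule (154) can be illustrated with finite matrices standing in for the Hamiltonians. The following is a minimal sketch, assuming two hypothetical 2x2 operators `H1` and `H2` with known eigenpairs; `kron_sum` and `matvec` are helper names introduced here, not notation from the text. It verifies that the product of two eigenvectors is an eigenvector of $H_1 + H_2$ on the product space, with eigenvalue equal to the sum of the individual energies.

```python
import math

def kron_sum(h1, h2):
    """Matrix of H1 + H2 on the product space, as nested lists."""
    n1, n2 = len(h1), len(h2)
    n = n1 * n2
    h = [[0.0] * n for _ in range(n)]
    for a in range(n1):
        for b in range(n2):
            for c in range(n1):
                for d in range(n2):
                    # H1 acts on the first index only, H2 on the second only.
                    h[a * n2 + b][c * n2 + d] = (
                        h1[a][c] * (1.0 if b == d else 0.0)
                        + (1.0 if a == c else 0.0) * h2[b][d]
                    )
    return h

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

# Two hypothetical Hamiltonians with known eigenpairs.
H1 = [[0.0, 1.0], [1.0, 0.0]]      # eigenvalue +1 for (1, 1)/sqrt(2)
H2 = [[2.0, 0.0], [0.0, 5.0]]      # eigenvalue 5 for (0, 1)
psi1, E1 = [1 / math.sqrt(2), 1 / math.sqrt(2)], 1.0
psi2, E2 = [0.0, 1.0], 5.0

# Product state Psi(q1, q2) = psi1(q1) * psi2(q2).
Psi = [x * y for x in psi1 for y in psi2]
H_Psi = matvec(kron_sum(H1, H2), Psi)

# Largest deviation of H Psi from (E1 + E2) Psi; it should vanish to rounding.
residual = max(abs(hp - (E1 + E2) * p) for hp, p in zip(H_Psi, Psi))
print(residual)
```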

11.33. The Exclusion Principle. — When two independent systems 
occupy the energy states $E_i^{(1)}$ and $E_j^{(2)}$ respectively, the combined system
has an energy

$$E = E_i^{(1)} + E_j^{(2)}$$

and a state function

$$\Psi = \psi_i^{(1)}(q_1)\cdot\psi_j^{(2)}(q_2)$$ (11-155)

We shall suppose for the moment that the individual states $\psi_i^{(1)}$ and
$\psi_j^{(2)}$ are non-degenerate. Then, unless there happen to be two energies $E_k^{(1)}$
and $E_l^{(2)}$ whose sum is precisely the same as $E_i^{(1)} + E_j^{(2)}$, the combined
state (155) will also be non-degenerate. This will generally be the case
when the two systems are different in a physical sense.

But if they are similar, e.g., both electrons, or both hydrogen atoms,
another situation arises. We may then drop all superscripts in the
description of the states, and write (155)

$$E = E_i + E_j, \qquad \Psi = \psi_i(q_1)\cdot\psi_j(q_2)$$ (11-156)

This state is degenerate, although $\psi_i$ and $\psi_j$ are not; for if we interchange
the indices i and j, or what is the same, interchange the coordinates $q_1$
and $q_2$ in $\Psi$, there results a different $\Psi$-function but not a different energy.
This degeneracy, which is peculiar to the description of any aggregate of
similar systems, is known as exchange degeneracy. Classically it implies
that the energy of the total system is unaltered when two individual
constituents exchange places.

In the more general case where $E_i$ has $g_i$ and $E_j$ has $g_j$ linearly
independent functions associated with it, the number of $\Psi$'s corresponding to E
will be, not $g_ig_j$, but $2g_ig_j$.

Returning to the case of non-degeneracy of $\psi_i$ and $\psi_j$ we note that the
two functions

$$\Psi_I = \psi_i(q_1)\psi_j(q_2), \qquad \Psi_{II} = \psi_j(q_1)\psi_i(q_2)$$

which are linearly independent, are equally good representatives of the
state in which $E = E_i + E_j$. Moreover, any linear combination of the
two satisfies the Schrödinger equation for this value of E, and has just
claim to be considered. Of course, only two such combinations can be
linearly independent. Let us then consider the function

$$a\Psi_I + b\Psi_{II}$$

where we shall assume $|a|^2 + |b|^2 = 1$ to assure normalization. On
exchanging the two systems, $\Psi_I \to \Psi_{II}$, $\Psi_{II} \to \Psi_I$, and hence the
function above transforms itself into

$$b\Psi_I + a\Psi_{II}$$

the numerical value of which for any given configuration $(q_1,q_2)$ will in
general be different from $a\Psi_I + b\Psi_{II}$. Physically, this implies that the
configuration which results when the two systems exchange places has an
altogether different probability than the original, a consequence that is
clearly objectionable.

However, among all linear combinations there are two which avoid this
dilemma. They are the symmetric combination

$$\Psi_S(q_1,q_2) = \sqrt{\tfrac{1}{2}}\,(\Psi_I + \Psi_{II})$$

and the antisymmetric one

$$\Psi_A(q_1,q_2) = \sqrt{\tfrac{1}{2}}\,(\Psi_I - \Psi_{II})$$

They are independent and indeed orthogonal; the first remains unaltered
on exchange of systems, the second changes its sign. Both, therefore, yield
probabilities $|\Psi|^2$ which are insensitive to exchange.
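The objection to a general combination, and the way the symmetric and antisymmetric combinations escape it, can be seen numerically. The sketch below assumes two hypothetical one-particle functions `psi_i` and `psi_j` (simple Gaussian-type forms chosen only for illustration) and compares the squared function before and after the two systems exchange places.

```python
import math

def psi_i(q):            # hypothetical one-particle state i
    return math.exp(-q * q)

def psi_j(q):            # hypothetical one-particle state j
    return q * math.exp(-q * q)

def combo(a, b, q1, q2):
    """a*Psi_I + b*Psi_II, with Psi_I = psi_i(q1)psi_j(q2), Psi_II = psi_j(q1)psi_i(q2)."""
    return a * psi_i(q1) * psi_j(q2) + b * psi_j(q1) * psi_i(q2)

def prob_change(a, b, q1=0.3, q2=1.1):
    """Change of the squared function when the two systems exchange places."""
    before = combo(a, b, q1, q2) ** 2
    after = combo(a, b, q2, q1) ** 2
    return abs(before - after)

r = math.sqrt(0.5)
change_general = prob_change(0.6, 0.8)    # |a|^2 + |b|^2 = 1, but a != +/- b
change_symmetric = prob_change(r, r)      # Psi_S
change_antisymmetric = prob_change(r, -r) # Psi_A

print(change_general, change_symmetric, change_antisymmetric)
```

Only the first number is appreciably different from zero; the other two vanish to rounding error.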

Consider now, not two, but n independent similar systems, in states
$\psi_i, \psi_j, \cdots, \psi_s$. The assemblage has the energy $E = E_i + E_j + \cdots + E_s$,
and is described by the state function

$$\Psi(q_1,q_2,\cdots q_n) = \psi_i(q_1)\psi_j(q_2)\cdots\psi_s(q_n)$$ (11-157)

But every permutation of the q's among the $\psi$'s on the right will produce a
new function belonging to the same E, provided the subscripts i, j, ..., s
are all different (which we shall assume for the moment). Hence, if P
represents any one of the n! possible permutations of the q's and
$\Psi_P(q_1,q_2,\cdots q_n)$ the function which results from (157) when this
permutation is made, then

$$\Psi(q_1,q_2,\cdots q_n) = \sum_P a_P\Psi_P$$ (11-158)

where the $a_P$ are arbitrary constants, one for each permutation (arbitrary
except for the normalization condition), represents an acceptable state
function for the energy E. Since there were originally n! linearly
independent functions, there will also be n! linearly independent combinations
of the type (158).

Fortunately, most of these are uninteresting, for they cause
$|\Psi(q_1,q_2,\cdots q_n)|^2$ to change when an exchange is made among any of the q's.
There are certainly two combinations, however, which preserve probabilities on
exchange. One is the symmetrical, the other the antisymmetrical combination.
The symmetrical one is formed by making all the coefficients $a_P$
in (158) equal:

$$\Psi_S(q_1,q_2,\cdots q_n) = (n!)^{-1/2}\sum_P \Psi_P$$ (11-159)

the antisymmetric one by giving opposite signs to even and odd permutations
(cf. Chapter 15):

$$\Psi_A(q_1,q_2,\cdots q_n) = (n!)^{-1/2}\sum_P (\pm 1)\,\Psi_P$$ (11-160)

A function is said to be symmetric with respect to a given operation if the
operation leaves it unchanged; it is said (in quantum mechanics) to be antisymmetric if the
operation changes its sign without altering it in any other way.

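A practical way of constructing (160) is to write the determinant

$$\Psi_A = (n!)^{-1/2}\begin{vmatrix}
\psi_i(q_1) & \psi_i(q_2) & \psi_i(q_3) & \cdots & \psi_i(q_n) \\
\psi_j(q_1) & \psi_j(q_2) & \psi_j(q_3) & \cdots & \psi_j(q_n) \\
\vdots & & & & \vdots \\
\psi_s(q_1) & \psi_s(q_2) & \psi_s(q_3) & \cdots & \psi_s(q_n)
\end{vmatrix}$$ (11-160')

which the reader will easily recognize as equivalent to the expansion (160).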

It is to these two functions, $\Psi_S$ and $\Psi_A$, that we must confine our
attention. Lest the simplicity of our formalism obscure significant details,
we recall that $q_r$ stands for all coordinates of the r-th system. Thus, if
the systems were electrons, $\psi_j(q_r)$ would be an abbreviation for a
combination of space and spin functions:

$$\phi_{j+}(x_r,y_r,z_r)\alpha(s_r) + \phi_{j-}(x_r,y_r,z_r)\beta(s_r)$$

in the notation of sec. 29, and an interchange of $q_r$ and $q_p$ means that $x_r$
is to be exchanged against $x_p$, $y_r$ against $y_p$, $z_r$ against $z_p$ and $s_r$ against $s_p$.
There is no a priori way of deciding which of the two functions, (159)
or (160), is preferable. But here the exclusion principle, early recognized
by Pauli, creates simplicity in a most effective way. It states that if the
individual systems belong to a certain class (see below), only antisymmetric
functions may be used in describing the assemblage. This principle is of the
nature of a postulate; it has not yet been deduced from more fundamental
axioms, although one might hope, from a mathematical point of view, that
this will prove possible. Why nature favors antisymmetrical rather
than symmetrical states, is at present a puzzle.

The elementary systems to which Pauli's principle is known to apply
are: electrons, protons, neutrons; photons, on the other hand, are described
by symmetrical state functions.

A very searching and interesting examination of the principle in the light of other
fundamental issues has been given by Pauli, Phys. Rev. 58, 716 (1940).

Perhaps the most important consequence of the exclusion principle is
this. Suppose our assemblage consists of electrons, two of which are
described by the same function $\psi_i$ (i.e., positional and spin parts are
identical). The determinant (160') will then have two equal rows, and
hence will vanish. We may therefore say: two systems obeying the Pauli
principle cannot be in the same state. This fact governs the structure of
atoms and molecules; each electron added to the shell of an atom must
have its own set of quantum numbers.
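The antisymmetric combination (160) and its two key properties can be tried out on a small case. The sketch below builds the signed sum over permutations directly; the three one-particle functions `f0`, `f1`, `f2` are hypothetical stand-ins for $\psi_i$, $\psi_j$, $\psi_s$, and the helper names are introduced here for illustration only. It checks that the function changes sign when two systems are exchanged and that it vanishes when two systems are assigned the same state.

```python
import math
from itertools import permutations

def parity(perm):
    """+1 for an even permutation, -1 for an odd one (counted by inversions)."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1

def psi_antisym(states, qs):
    """(n!)^(-1/2) times the signed sum over permutations of the q's, as in (160)."""
    n = len(states)
    total = 0.0
    for perm in permutations(range(n)):
        term = float(parity(perm))
        for state, k in zip(states, perm):
            term *= state(qs[k])
        total += term
    return total / math.sqrt(math.factorial(n))

# Hypothetical one-particle functions.
f0 = lambda q: math.exp(-q * q)
f1 = lambda q: q * math.exp(-q * q)
f2 = lambda q: (2 * q * q - 1) * math.exp(-q * q)

qs = [0.2, -0.7, 1.3]
value = psi_antisym([f0, f1, f2], qs)
swapped = psi_antisym([f0, f1, f2], [qs[1], qs[0], qs[2]])   # exchange systems 1 and 2
pauli = psi_antisym([f0, f0, f2], qs)                        # two systems in the same state

print(value, swapped, pauli)
```

For larger n one would evaluate the determinant (160') instead of summing n! terms; the explicit sum is used here only because it mirrors (160) term by term.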

The exclusion principle makes it impossible to distinguish two states 
which differ only by an interchange of two constituent systems, a fact which 
has already been noted. 

Photons, which are described by the symmetrical function (159), may
exist in identical states, because that function does not vanish when two
sets of indices like i and j, contained in $\Psi_S$, become equal.

11.34. Excited States of the Helium Atom. — To show how the Pauli 
principle is applied we treat some of the excited states of the helium atom. 
The latter is to be regarded as a simple assemblage of 2 electrons moving in 
the Coulomb field of the nucleus (and under their mutual repulsion), 
hence the considerations of the foregoing section apply. However, in the 
first part of our treatment we shall ignore both the electron spin and the 
exclusion principle. 

The Schrödinger equation has already been given (eq. 90); it is

$$\left(H_1 + H_2 + \frac{e^2}{r_{12}}\right)\Psi = E\Psi$$ (11-161)

where

$$H_i = -\frac{\hbar^2}{2m}\nabla_i^2 - \frac{2e^2}{r_i}$$

If the term $e^2/r_{12}$ were absent the two electrons would be independent, and
$\Psi$ would be a product of the form $\psi_i(q_1)\cdot\psi_j(q_2)$, E being $E_i + E_j$. Moreover
$\psi_i$ and $\psi_j$ would be hydrogen eigenfunctions with atomic number
Z = 2, for $H_1$ and $H_2$ are Hamiltonian operators for a single electron in a
Coulomb field. To retain the notation of sec. 19 we shall now write u for
the individual electron functions, so that, in the absence of the interaction
term,

$$\Psi = u_i(x_1y_1z_1)\,u_j(x_2y_2z_2)$$ (11-162)


Functions of this type will be used as variation functions with the complete
Hamiltonian (161). Let us first give thought to the proper choice of
$u_i$. The state corresponding to the lowest energy of a single electron is
(cf. eq. 67a)

$$u_{10} = \left(\frac{Z^3}{\pi a_0^3}\right)^{1/2}e^{-Zr/a_0}, \qquad Z = 2$$ (11-163)

We are writing here, in place of the single subscript i, the values of the two
quantum numbers n = 1 and l = 0. The first excited state is either

$$u_{20} = R_{20}(r)\,Y_0$$

or

$$u_{21} = R_{21}(r)\,Y_1$$

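The spherical harmonic $Y_0$ is a constant, but $Y_1$ is any linear combination
of the three functions $P_1^1(\cos\theta)e^{i\varphi}$, $P_1(\cos\theta)$ and $P_1^1(\cos\theta)e^{-i\varphi}$. It will be
convenient to choose the following normalized combinations

$$Y_x = \frac{1}{2}\sqrt{\frac{3}{4\pi}}\left[P_1^1(\cos\theta)e^{i\varphi} + P_1^1(\cos\theta)e^{-i\varphi}\right] = \sqrt{\frac{3}{4\pi}}\sin\theta\cos\varphi$$

$$Y_y = \frac{1}{2i}\sqrt{\frac{3}{4\pi}}\left[P_1^1(\cos\theta)e^{i\varphi} - P_1^1(\cos\theta)e^{-i\varphi}\right] = \sqrt{\frac{3}{4\pi}}\sin\theta\sin\varphi$$

$$Y_z = \sqrt{\frac{3}{4\pi}}\,P_1(\cos\theta) = \sqrt{\frac{3}{4\pi}}\cos\theta$$

and to define

$$u_{20} = R_{20}Y_0$$
$$u_{2x} = R_{21}Y_x$$
$$u_{2y} = R_{21}Y_y$$
$$u_{2z} = R_{21}Y_z$$ (11-164)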


as the four independent, orthonormal functions describing the first excited
state of the one-electron system. The product (162) can be formed by
combining $u_{10}$ with any one of the four functions (164); furthermore, the
arguments can be interchanged in each of the functions thus constructed.
We are therefore concerned with the following eight functions, each of
which is a solution of eq. (161) with the term $e^2/r_{12}$ deleted, and belongs
to the energy

$$E_0 = -\frac{2e^2}{a_0}\left(1 + \frac{1}{4}\right) = -\frac{5e^2}{2a_0}$$ (11-165)

$R_{21}$ is given in eq. (66); its explicit form will not be needed here.
$$\begin{aligned}
\psi_1 &= u_{10}(1)u_{20}(2) & \psi_2 &= u_{20}(1)u_{10}(2) \\
\psi_3 &= u_{10}(1)u_{2x}(2) & \psi_4 &= u_{2x}(1)u_{10}(2) \\
\psi_5 &= u_{10}(1)u_{2y}(2) & \psi_6 &= u_{2y}(1)u_{10}(2) \\
\psi_7 &= u_{10}(1)u_{2z}(2) & \psi_8 &= u_{2z}(1)u_{10}(2)
\end{aligned}$$ (11-166)

In writing them we have indicated the arguments $(x_1y_1z_1)$ and $(x_2y_2z_2)$
simply by (1) and (2). A combination of these functions

$$\Phi = \sum_{\lambda=1}^{8} a_\lambda\psi_\lambda$$

will be used as a variation function in the sense of sec. 20. The best energies
of the system are given by (97), and this reduces at once to the form
(104) because the $\psi$ are orthonormal and belong to the operator
$H_0 = H_1 + H_2$. The perturbing term is

$$H' = \frac{e^2}{r_{12}}$$

The next step in the solution of our problem is the calculation of the
matrix elements $\int\psi_i H'\psi_j\,dx_1dy_1dz_1dx_2dy_2dz_2$ using the functions (166),
the details of which may be left for the reader. Symmetry arguments
may be used to show that

$$H'_{11} = H'_{22}, \quad H'_{33} = H'_{44}, \quad H'_{55} = H'_{66}, \quad H'_{77} = H'_{88}$$

and that only functions in the same line of (166) give non-vanishing elements.
Furthermore the volume element adopted in the evaluation of I
(sec. 19) is convenient in proving:

$$H'_{33} = H'_{55} = H'_{77}, \qquad H'_{34} = H'_{56} = H'_{78}$$

Since the $\psi$ are real, $H'_{ij} = H'_{ji}$. We are left, therefore, only with the
following matrix elements:

$$H'_{11} = \int u_{10}^2(1)\,u_{20}^2(2)\,\frac{e^2}{r_{12}}\,d\tau \equiv J$$

$$H'_{12} = \int u_{10}(1)u_{20}(2)\,\frac{e^2}{r_{12}}\,u_{20}(1)u_{10}(2)\,d\tau \equiv K$$

$$H'_{33} = \int u_{10}^2(1)\,u_{2x}^2(2)\,\frac{e^2}{r_{12}}\,d\tau \equiv J'$$

$$H'_{34} = \int u_{10}(1)u_{2x}(2)\,\frac{e^2}{r_{12}}\,u_{2x}(1)u_{10}(2)\,d\tau \equiv K'$$

See Heisenberg, W., Z. Phys. 39, 499 (1926).
In a sense previously defined (see sec. 11.21), J and J' are Coulomb integrals,
K and K' exchange integrals.

The determinantal eq. (97) becomes

$$\begin{vmatrix}
J-\epsilon & K & & & & & & \\
K & J-\epsilon & & & & & & \\
 & & J'-\epsilon & K' & & & & \\
 & & K' & J'-\epsilon & & & & \\
 & & & & J'-\epsilon & K' & & \\
 & & & & K' & J'-\epsilon & & \\
 & & & & & & J'-\epsilon & K' \\
 & & & & & & K' & J'-\epsilon
\end{vmatrix} = 0$$ (11-167)

provided we write $\epsilon$ for $E - E_0$. All elements not written are zeros. The
determinant has two single roots: $\epsilon_1 = J - K$, $\epsilon_2 = J + K$ and two
triple roots: $\epsilon_3 = J' - K'$, $\epsilon_4 = J' + K'$. The perturbation $e^2/r_{12}$ may
therefore be said to change the one unperturbed level $E_0$ into four perturbed
levels: $E_0 + \epsilon_1$, $E_0 + \epsilon_2$, $E_0 + \epsilon_3$, $E_0 + \epsilon_4$, as indicated qualitatively
in the diagram (Fig. 6).

[Fig. 11-6: energy level diagram; the two splittings are labeled 2K and 2K'.]

To find the functions corresponding to the eight roots $\epsilon$ we must return
to equations (96):

$$a_1(J - \epsilon) + a_2K = 0$$
$$a_1K + a_2(J - \epsilon) = 0$$
$$a_3(J' - \epsilon) + a_4K' = 0$$
$$a_3K' + a_4(J' - \epsilon) = 0 \quad \text{etc.}$$

On substituting $\epsilon_1$ for $\epsilon$ we find $a_2 = -a_1$, $a_3 = a_4 = \cdots = a_8 = 0$. On
substituting $\epsilon = \epsilon_2$, we find $a_2 = a_1$, $a_3 = a_4 = \cdots = a_8 = 0$, and so forth.
We thus obtain the following set of energies and normalized variation
functions:

TABLE 1

E                    $\Psi$                              Symmetry   Multiplicity

$E_0 + J - K$        $\sqrt{1/2}\,(\psi_1 - \psi_2)$     a          Triplet

$E_0 + J + K$        $\sqrt{1/2}\,(\psi_1 + \psi_2)$     s          Singlet

$E_0 + J' - K'$      $\sqrt{1/2}\,(\psi_3 - \psi_4)$     a          Triplet
                     $\sqrt{1/2}\,(\psi_5 - \psi_6)$     a          Triplet
                     $\sqrt{1/2}\,(\psi_7 - \psi_8)$     a          Triplet

$E_0 + J' + K'$      $\sqrt{1/2}\,(\psi_3 + \psi_4)$     s          Singlet
                     $\sqrt{1/2}\,(\psi_5 + \psi_6)$     s          Singlet
                     $\sqrt{1/2}\,(\psi_7 + \psi_8)$     s          Singlet
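The block structure of (167) and the entries of Table 1 can be confirmed with a small numerical experiment. In the sketch below the values of J, K, J', K' are hypothetical numbers in arbitrary energy units, chosen only so that the four roots are distinct; the code builds the 8x8 matrix of H' in the basis (166), checks that each combination $\sqrt{1/2}(\psi_{odd} \mp \psi_{even})$ is an eigenvector, and counts how often each root occurs.

```python
import math

J, K, Jp, Kp = 0.40, 0.05, 0.45, 0.03    # hypothetical values, arbitrary units

# Matrix of H' in the basis psi_1 ... psi_8 of (166): four 2x2 blocks.
blocks = [(J, K), (Jp, Kp), (Jp, Kp), (Jp, Kp)]
H = [[0.0] * 8 for _ in range(8)]
for n, (coulomb, exchange) in enumerate(blocks):
    a, b = 2 * n, 2 * n + 1
    H[a][a] = H[b][b] = coulomb
    H[a][b] = H[b][a] = exchange

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

r = math.sqrt(0.5)
roots = []
for n, (coulomb, exchange) in enumerate(blocks):
    for sign, label in ((-1, "a"), (+1, "s")):
        v = [0.0] * 8
        v[2 * n], v[2 * n + 1] = r, sign * r     # sqrt(1/2) (psi_odd +/- psi_even)
        eps = coulomb + sign * exchange
        hv = matvec(H, v)
        # Each trial vector must satisfy H' v = eps v.
        assert max(abs(x - eps * y) for x, y in zip(hv, v)) < 1e-12
        roots.append((round(eps, 10), label))

distinct = sorted(set(e for e, _ in roots))
multiplicity = {e: sum(1 for x, _ in roots if x == e) for e in distinct}
print(multiplicity)
```

The two roots built from J and K occur once each, those built from J' and K' three times each, in agreement with the statement about single and triple roots.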


It now becomes necessary to include the spin in our analysis. To do
this accurately would require a modification of the Hamiltonian operator
(161), for the magnetic moments of the spinning electrons produce an
interaction with the magnetic field due to their orbital motions and this
interaction has not been included in (161). We shall omit this spin-orbit
interaction and refer the reader to the literature for the more accurate
treatment. In other words, we shall suppose that the Hamiltonian does
not act on the spin coordinates. The state function is then separable and
appears as the product of an orbital (any of the functions in the table)
and a spin function, and the latter may be taken as an eigenfunction of
$\sigma_z$ for each electron. Let us consider these spin functions more closely.
For the two electrons, we have four functions:

$$\alpha(s_1)\alpha(s_2), \quad \alpha(s_1)\beta(s_2), \quad \beta(s_1)\alpha(s_2), \quad \text{and} \quad \beta(s_1)\beta(s_2)$$

These, however, do not have convenient exchange properties, for when $s_1$
and $s_2$ are interchanged, the first and last remain unaltered, the second
transforms into the third and the third into the second. But it is possible
to construct from the second and third two other, equivalent functions,
which are symmetrical and antisymmetrical with respect to an exchange of
spin coordinates. They are, when normalized, $\sqrt{1/2}\,[\alpha(s_1)\beta(s_2) + \beta(s_1)\alpha(s_2)]$
and $\sqrt{1/2}\,[\alpha(s_1)\beta(s_2) - \beta(s_1)\alpha(s_2)]$. We have in this way obtained four
spin functions

$$\Sigma_1 = \alpha(s_1)\alpha(s_2), \quad \Sigma_2 = \sqrt{\tfrac{1}{2}}\,[\alpha(s_1)\beta(s_2) + \beta(s_1)\alpha(s_2)], \quad \Sigma_3 = \beta(s_1)\beta(s_2);$$

$$A = \sqrt{\tfrac{1}{2}}\,[\alpha(s_1)\beta(s_2) - \beta(s_1)\alpha(s_2)]$$ (11-168)

the first three of which are symmetrical, only the last being antisymmetrical.
Furthermore, this set of functions is orthogonal (and complete).

Condon, E. U., and Shortley, G. H., "The Theory of Atomic Spectra," Macmillan
Co., New York, 1936.

To include the spin we need only multiply each one of the functions in
Table 1 by one of the spin functions $\Sigma_1$ to A, a procedure which yields 32
different functions of position and spin coordinates. But here the exclusion
principle effects a great simplification. It says that only functions which
are antisymmetrical when all coordinates, i.e., position and spin coordinates,
of the two electrons are interchanged, are to be permitted. Hence a
function of Table 1 which is symmetrical can only be combined with A,
and a function which is antisymmetrical only with $\Sigma_1$, $\Sigma_2$ and $\Sigma_3$.

Now the functions marked a in the table are antisymmetric; they can
be multiplied by any one of the three $\Sigma$-functions. Each of them corresponds,
therefore, to three states. For this reason the energy states
$E_0 + J' - K'$ and $E_0 + J - K$ are said to be triplet states. If spin-orbit
interaction had been included in our calculation each of these levels
would have appeared as three closely adjacent levels, while the other
energies, marked singlets, would have remained single.
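The bookkeeping of this selection rule is easy to mechanize. The following sketch represents each spin function of (168) by its four components in the basis $\alpha\alpha$, $\alpha\beta$, $\beta\alpha$, $\beta\beta$ (a representation assumed here for illustration), determines its symmetry under exchange of $s_1$ and $s_2$, and then counts how many of the 32 products with the orbital functions of Table 1 are antisymmetric overall.

```python
import math

r = math.sqrt(0.5)
# Components in the basis aa, ab, ba, bb (first letter: electron 1, second: electron 2).
spin = {
    "Sigma1": [1.0, 0.0, 0.0, 0.0],
    "Sigma2": [0.0, r, r, 0.0],
    "Sigma3": [0.0, 0.0, 0.0, 1.0],
    "A":      [0.0, r, -r, 0.0],
}

def exchange(v):
    """Interchange s1 and s2: the ab and ba components trade places."""
    return [v[0], v[2], v[1], v[3]]

def symmetry(v):
    swapped = exchange(v)
    if all(abs(x - y) < 1e-12 for x, y in zip(swapped, v)):
        return "s"
    if all(abs(x + y) < 1e-12 for x, y in zip(swapped, v)):
        return "a"
    return "none"

spin_sym = {name: symmetry(v) for name, v in spin.items()}

# Orbital symmetries of the eight functions of Table 1, in the order listed there.
orbital_sym = ["a", "s", "a", "a", "a", "s", "s", "s"]

# A product is antisymmetric only when exactly one of its factors is antisymmetric.
allowed = [
    (k, name)
    for k, o in enumerate(orbital_sym)
    for name, s in spin_sym.items()
    if o != s
]
print(spin_sym, len(allowed))
```

Of the 32 products, 16 survive: twelve from the four antisymmetric orbitals (three spin functions each) and four from the symmetric orbitals combined with A.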

It is true that the functions in Table 1 are only approximate solutions
of eq. (161). Nevertheless what we have said about their symmetry
with respect to exchange of electrons may be shown to hold rigorously.
The structure of the helium energy spectrum, and in particular the
singlet-triplet character of the states, are therefore correctly given by the simple
theory of this section; the numerical values of the energy levels will be in
error.

The normal state of the helium atom, whose energy was computed
approximately in sec. 19 of this chapter, is given in the present notation by
$u_{10}(1)u_{10}(2)$, if we neglect the spin. It is clearly symmetrical and can
only be multiplied by A when the spins are introduced. Hence it is a
singlet state. When the helium atom is in a singlet state, its probability
of passing into a triplet state under emission or absorption of radiation is
very small, as may be shown by an extension of the methods used in
sec. 11.28. Hence triplet and singlet levels do not combine, and helium
may be said to have two distinct spectra, the triplet spectrum to which
spectroscopists apply the term "orthohelium" spectrum, and the singlet
spectrum called "parhelium" spectrum.


Problem a. Instead of using the 8 functions (166) as linear variation functions,
start with the 32 functions obtained from (166) by multiplying each of them by $\Sigma_1$, $\Sigma_2$,
$\Sigma_3$, A. Show that, if these 32 functions are suitably arranged, the determinantal
equation is a four-fold repetition of the one obtained above, and that it yields the same
results in regard to both energies and functions.

Problem b. The following spin operators for two electrons may be defined:

$$\sigma_z = \sigma_{z1} + \sigma_{z2}$$

$$\sigma^2 = (\sigma_1 + \sigma_2)^2 = \sigma_{x1}^2 + \sigma_{y1}^2 + \sigma_{z1}^2 + \sigma_{x2}^2 + \sigma_{y2}^2 + \sigma_{z2}^2 + 2(\sigma_{x1}\sigma_{x2} + \sigma_{y1}\sigma_{y2} + \sigma_{z1}\sigma_{z2})$$

where $\sigma_{z1}$ is the operator $\sigma_z$ acting on spin coordinate $s_1$, etc. Show that $\Sigma_1$, $\Sigma_2$, $\Sigma_3$ and
A are all eigenstates with respect to both of these operators, in particular that

$$\sigma_z\Sigma_1 = 2\Sigma_1, \quad \sigma_z\Sigma_2 = 0, \quad \sigma_z\Sigma_3 = -2\Sigma_3, \quad \sigma_zA = 0$$

$$\sigma^2\Sigma_1 = 8\Sigma_1, \quad \sigma^2\Sigma_2 = 8\Sigma_2, \quad \sigma^2\Sigma_3 = 8\Sigma_3, \quad \sigma^2A = 0$$

Are these results consistent with the classical interpretation according to which

$\Sigma_1$ is the state in which both spins are parallel and along Z,

$\Sigma_2$ is the state in which both spins are parallel and perpendicular to Z,

$\Sigma_3$ is the state in which both spins are parallel and along -Z,

A is the state in which both spins are opposed and yield no resultant angular momentum?
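The eigenvalue relations quoted in Problem b can be checked numerically before attempting the proof. The sketch below assumes the usual 2x2 Pauli matrices for $\sigma_x$, $\sigma_y$, $\sigma_z$ acting on the components of $\alpha$ and $\beta$, and builds the two-electron operators as Kronecker products; the helper names are introduced here and are not part of the text. It is a numerical check only and does not replace the algebraic demonstration asked for.

```python
import math

# Pauli matrices acting on (alpha, beta) components.
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices (basis order aa, ab, ba, bb)."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

# Total components sigma_x = sigma_x1 + sigma_x2, etc.
total = [add(kron(s, I2), kron(I2, s)) for s in (SX, SY, SZ)]
sigma_z = total[2]
sigma_sq = [[0] * 4 for _ in range(4)]
for t in total:
    sigma_sq = add(sigma_sq, matmul(t, t))

r = math.sqrt(0.5)
states = {"Sigma1": [1, 0, 0, 0], "Sigma2": [0, r, r, 0],
          "Sigma3": [0, 0, 0, 1], "A": [0, r, -r, 0]}

def eigenvalue(op, v):
    """Return lam if op v = lam v (checked to 1e-12); fail otherwise."""
    w = matvec(op, v)
    lam = sum(x * y for x, y in zip(w, v)) / sum(y * y for y in v)
    assert all(abs(x - lam * y) < 1e-12 for x, y in zip(w, v))
    return round(lam.real, 10) if isinstance(lam, complex) else round(lam, 10)

results = {name: (eigenvalue(sigma_z, v), eigenvalue(sigma_sq, v))
           for name, v in states.items()}
print(results)
```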

11.36. The Hydrogen Molecule. — One of the stumbling blocks of pre- 
quantum chemistry was the phenomenon of homo-polar binding; it is 
impossible to explain on the basis of classical dynamics the union of two 
hydrogen atoms to form a molecule. The only attraction which two 
neutral structures like H-atoms could possibly exhibit was due to quadru- 
pole forces, and these were known to be too weak to account for molecular 
binding. It was shown by Heitler and London that the homo-polar bond 
is caused by a typical quantum-mechanical effect: the exchange of 
the two electrons. Its meaning will be clear from the following discussion. 

The method of calculation to be employed is a simple one which lays
little claim to quantitative accuracy but exposes the significant facts in a
beautiful way. It is similar to the treatment of the $H_2^+$-ion, from which it
differs by the presence of two electrons instead of one. The coordinate
system to be used will be clear from Fig. 7; particles 1 and 2 are electrons,
A and B are the protons whose positions are regarded as fixed. In connection
with Fig. 7, we also wish to outline the use of a coordinate system and
a volume element which are very convenient in the numerical work involved
in this problem.

The coordinate system for the two electrons will contain the six variables

$$A_1, \quad B_1, \quad B_2, \quad r_{12}, \quad \varphi_1, \quad \varphi_2;$$

$$|B_2 - B_1| \le r_{12} \le B_1 + B_2, \qquad 0 < B_2 < \infty$$

$$|B_1 - R| \le A_1 \le B_1 + R, \qquad 0 < B_1 < \infty$$

87 Heitler, W., and London, F., Z. Phys. 44, 455 (1927).

88 The most elaborate and accurate calculation, also employing the variational
method, was made by James, H. M., and Coolidge, A. S., J. Chem. Phys. 1, 825 (1933).
The volume element $d\tau = d\tau_1d\tau_2$, where

$$d\tau_1 = A_1^2\,dA_1\sin\theta_1\,d\theta_1\,d\varphi_1$$

Now

$$B_1^2 = A_1^2 + R^2 - 2A_1R\cos\theta_1$$

whence

$$2B_1\,dB_1 = 2A_1R\sin\theta_1\,d\theta_1$$

On eliminating $\sin\theta_1\,d\theta_1$ from $d\tau_1$ by means of this last relation, we find

$$d\tau_1 = \frac{1}{R}\,A_1\,dA_1\,B_1\,dB_1\,d\varphi_1$$

The element $d\tau_2$ is obtained by writing down an expression similar to $d\tau_1$,
but using $B_1$ as base line:

$$d\tau_2 = \frac{1}{B_1}\,r_{12}\,dr_{12}\,B_2\,dB_2\,d\varphi_2$$

Hence the product $d\tau_1d\tau_2$ is

$$d\tau = \frac{1}{R}\,A_1\,dA_1\,B_2\,dB_2\,r_{12}\,dr_{12}\,dB_1\,d\varphi_1\,d\varphi_2$$ (11-169)


Several similar volume elements can be constructed by the same method. 

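After this excursion, let us consider the Schrödinger equation of the
$H_2$-problem. It is

$$H\psi \equiv \left[-\frac{\hbar^2}{2m}(\nabla_1^2 + \nabla_2^2) - e^2\left(\frac{1}{A_1} + \frac{1}{B_2} + \frac{1}{A_2} + \frac{1}{B_1} - \frac{1}{r_{12}} - \frac{1}{R}\right)\right]\psi = E\psi$$ (11-170)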

We endeavor to solve it by the method of linear variation functions, choosing
as constituents of the trial function simple but reasonable approximations
to the correct $\psi$. If H did not contain the last four items in the
parenthesis multiplying $e^2$, it would simply be the sum of two hydrogen-atom
Hamiltonians, and

$$\psi = u_A(1)\,u_B(2)$$

where

$$u_A(1) = (\pi a_0^3)^{-1/2}e^{-A_1/a_0}, \qquad u_B(2) = (\pi a_0^3)^{-1/2}e^{-B_2/a_0}$$

are hydrogen functions centered about A and B respectively. On the
other hand, if the terms $1/A_1 + 1/B_2 - 1/r_{12} - 1/R$ were missing from
the parenthesis, H would also be the sum of two hydrogen-atom Hamiltonians,
but $\psi = u_B(1)\,u_A(2)$. Both of these $\psi$'s are equally good approximations,
and both must be included in the trial function. Note that they
differ with respect to an exchange of the electrons (or, what amounts in this
problem to the same thing, the protons). Hence we adopt

$$\psi = c_1u_A(1)u_B(2) + c_2u_B(1)u_A(2)$$ (11-170a)


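as variation function in minimizing $\int\psi H\psi\,d\tau$. As explained in sec. 20,
the process leads to the secular equations

$$c_1(\mathcal{H}_{11} - \Delta_{11}E) + c_2(\mathcal{H}_{12} - \Delta_{12}E) = 0$$

$$c_1(\mathcal{H}_{21} - \Delta_{21}E) + c_2(\mathcal{H}_{22} - \Delta_{22}E) = 0$$ (11-171)

and E is given by

$$\begin{vmatrix}
\mathcal{H}_{11} - \Delta_{11}E & \mathcal{H}_{12} - \Delta_{12}E \\
\mathcal{H}_{21} - \Delta_{21}E & \mathcal{H}_{22} - \Delta_{22}E
\end{vmatrix} = 0$$ (11-172)

Here

$$\Delta_{11} = \int u_A^2(1)\,u_B^2(2)\,d\tau_1d\tau_2 = 1 = \Delta_{22}$$

$$\Delta_{12} = \int u_A(1)u_B(2)u_B(1)u_A(2)\,d\tau_1d\tau_2 = \left[\int u_A(1)u_B(1)\,d\tau_1\right]^2 = \Delta_{21}$$

The latter integral is familiar from sec. 21; it is the quantity there called
$\Delta_{AB}$. Hence

$$\Delta_{12} = \Delta_{21} = e^{-2\rho}\left(1 + \rho + \frac{\rho^2}{3}\right)^2$$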

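Next, we turn to

$$\mathcal{H}_{11} = \int u_A(1)u_B(2)\,H\,u_A(1)u_B(2)\,d\tau_1d\tau_2$$

The $\nabla^2$-terms in H need not be calculated; their effect upon $u_A(1)$ and
$u_B(2)$ is at once obtainable from the differential equations which these
functions satisfy:

$$\left(-\frac{\hbar^2}{2m}\nabla_1^2 - \frac{e^2}{A_1}\right)u_A(1) = E_H\,u_A(1), \qquad \left(-\frac{\hbar^2}{2m}\nabla_2^2 - \frac{e^2}{B_2}\right)u_B(2) = E_H\,u_B(2)$$

In this way we find

$$\mathcal{H}_{11} = 2E_H + 2J + J' + \frac{e^2}{R}$$

where

$$J = -e^2\int u_A^2(1)\,u_B^2(2)\,B_1^{-1}\,d\tau_1d\tau_2 = -e^2\int\frac{u_A^2(1)}{B_1}\,d\tau_1$$ (11-173)

and

$$J' = e^2\int\frac{u_A^2(1)\,u_B^2(2)}{r_{12}}\,d\tau_1d\tau_2$$ (11-174)

J is given in sec. 21, eq. (100), and J' has the value

$$J' = \frac{e^2}{a_0}\left[\frac{1}{\rho} - e^{-2\rho}\left(\frac{1}{\rho} + \frac{11}{8} + \frac{3}{4}\rho + \frac{1}{6}\rho^2\right)\right]$$

Problem. Prove this result, using the system of coordinates and the volume element
(169).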

Furthermore,

$$\mathcal{H}_{22} = \mathcal{H}_{11}$$

as the reader will easily verify. In a similar way,

$$\mathcal{H}_{12} = \mathcal{H}_{21} = 2E_H\Delta_{12} + 2K\Delta_{12}^{1/2} + K' + \frac{e^2}{R}\Delta_{12}$$

where

$$K = -e^2\int u_A(1)\,u_B(1)\,B_1^{-1}\,d\tau_1$$ (11-175)

and

$$K' = e^2\int\frac{u_A(1)u_B(2)u_B(1)u_A(2)}{r_{12}}\,d\tau_1d\tau_2$$ (11-176)

The value of K is given in eq. (100), and

$$K' = \frac{e^2}{5a_0}\left\{-e^{-2\rho}\left(-\frac{25}{8} + \frac{23}{4}\rho + 3\rho^2 + \frac{1}{3}\rho^3\right) + \frac{6}{\rho}\left[\Delta(\gamma + \ln\rho) - 2\sqrt{\Delta\Delta'}\,Ei(-2\rho) + \Delta'\,Ei(-4\rho)\right]\right\}$$

where $\gamma = 0.5772$ (Euler-Mascheroni constant),

$$\Delta = \Delta_{12}, \qquad \Delta' = e^{2\rho}\left(1 - \rho + \frac{\rho^2}{3}\right)^2$$

and Ei(x) is an abbreviation for the exponential integral

$$Ei(x) = \int_{-\infty}^{x}\frac{e^u}{u}\,du,$$

which is tabulated and discussed, for instance, in "Tables of Sine, Cosine
and Exponential Integrals," Federal Works Agency, New York, 1940.

Problem. Evaluate K'. See in this connection, Sugiura, Y., Z. f. Phys. 45, 484
(1927). (This problem is considerably more difficult than the former.)


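The two roots of (172) are

$$E_1 = \frac{\mathcal{H}_{11} + \mathcal{H}_{12}}{1 + \Delta} = 2E_H + \frac{e^2}{R} + \frac{2J + J' + 2K\Delta^{1/2} + K'}{1 + \Delta}$$

$$E_2 = \frac{\mathcal{H}_{11} - \mathcal{H}_{12}}{1 - \Delta} = 2E_H + \frac{e^2}{R} + \frac{2J + J' - 2K\Delta^{1/2} - K'}{1 - \Delta}$$ (11-177)

Substitution into (171) shows that to $E_1$ there corresponds the function

$$\phi_1 = [2(1 + \Delta)]^{-1/2}\,[u_A(1)u_B(2) + u_B(1)u_A(2)]$$ (11-178)

and to $E_2$ the function

$$\phi_2 = [2(1 - \Delta)]^{-1/2}\,[u_A(1)u_B(2) - u_B(1)u_A(2)]$$ (11-179)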

The energies $E_1$ and $E_2$ are plotted against R, the internuclear distance, in
Pauling and Wilson. It will be seen that $E_1$ has a minimum in the
neighborhood of the experimental internuclear distance of the $H_2$-molecule;
at this minimum $E_1$ is negative and equal in order of magnitude to the
experimentally known minimum which causes the stability of the molecule.
On the other hand, $E_2$ is positive for all R, decreasing in monotone fashion
with increasing R. It, therefore, corresponds to repulsion between the
atoms. Comparison of $E_1$ and $E_2$ shows the difference in their behavior as
functions of R to be predominantly due to the presence of the K and K'
integrals. These would have been missing if electron exchange had not
been taken account of by introducing the two functions constituting the $\psi$
of eq. (170). In that case also, there would have been only one energy and
not two. Now while (170) may be a crude approximation, the fact that
two equivalent functions, differing only with respect to electron exchange,
will compose the correct solution of (170) is beyond doubt, hence the
qualitative aspects here obtained cannot be questioned. The integrals K and
K' are called exchange integrals.
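The behavior of the two roots (177) as functions of R can be explored numerically. The sketch below is an illustration under stated assumptions, not the calculation of the text: $\rho = R/a_0$ is assumed, energies are measured from $2E_H$ in units of $e^2/a_0$, and the closed forms used for J and K are the standard $H_2^+$ integrals, assumed here to be what eq. (100) of sec. 21 provides. The exponential integral is evaluated by a power series for small argument and a continued fraction otherwise; `ei_neg` and `energies` are names introduced for this example.

```python
import math

EULER_GAMMA = 0.5772156649015329

def ei_neg(x):
    """Exponential integral Ei(-x) for x > 0."""
    if x < 1.0:
        # Power series: Ei(-x) = gamma + ln x + sum_{k>=1} (-x)^k / (k * k!)
        total, term = EULER_GAMMA + math.log(x), 1.0
        for k in range(1, 60):
            term *= -x / k
            total += term / k
        return total
    # Continued fraction for E1(x) = -Ei(-x), evaluated from the tail upward.
    n_terms = 200
    t = x + 2 * n_terms + 1
    for k in range(n_terms, 0, -1):
        t = (x + 2 * k - 1) - k * k / t
    return -math.exp(-x) / t

def energies(rho):
    """E1 - 2E_H and E2 - 2E_H of (177), in units of e^2/a0, at rho = R/a0."""
    s = math.exp(-rho) * (1 + rho + rho**2 / 3)        # overlap; Delta = s**2
    s_prime = math.exp(rho) * (1 - rho + rho**2 / 3)   # Delta' = s_prime**2
    delta = s * s
    J = -1 / rho + math.exp(-2 * rho) * (1 + 1 / rho)  # assumed form of eq. (100)
    K = -math.exp(-rho) * (1 + rho)                    # assumed form of eq. (100)
    Jp = 1 / rho - math.exp(-2 * rho) * (1 / rho + 11 / 8 + 3 * rho / 4 + rho**2 / 6)
    Kp = 0.2 * (
        -math.exp(-2 * rho) * (-25 / 8 + 23 * rho / 4 + 3 * rho**2 + rho**3 / 3)
        + (6 / rho) * (delta * (EULER_GAMMA + math.log(rho))
                       - 2 * s * s_prime * ei_neg(2 * rho)
                       + s_prime**2 * ei_neg(4 * rho))
    )
    e1 = 1 / rho + (2 * J + Jp + 2 * K * s + Kp) / (1 + delta)
    e2 = 1 / rho + (2 * J + Jp - 2 * K * s - Kp) / (1 - delta)
    return e1, e2

grid = [0.5 + 0.01 * n for n in range(551)]            # rho from 0.5 to 6.0
rho_min = min(grid, key=lambda rho: energies(rho)[0])
e1_min, e2_at_min = energies(rho_min)
print(rho_min, e1_min, e2_at_min)
```

On this grid the symmetric root shows a shallow minimum at a separation of between one and two Bohr radii, while the antisymmetric root is positive there, in line with the attraction and repulsion described above.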

Let us now include the spin and apply the Pauli principle. The spin
functions are those already encountered in the helium problem, eq. (168).
If the resultant function is to be antisymmetrical, $\phi_1$, which is symmetrical
in the position coordinates of electrons 1 and 2, must be multiplied by an
antisymmetrical function of the spins, of which there is only one, namely A.
However, $\phi_2$ may be multiplied by one of the three functions $\Sigma_1$, $\Sigma_2$ or $\Sigma_3$.
It represents a triplet state while $\phi_1$ is a singlet.

To the energy $E_2$, therefore, there correspond three times as many
quantum mechanical states as to $E_1$. From this fact may be drawn the
conclusion that when two H-atoms approach they will, ceteris paribus, be
three times as likely to repel as to attract each other.

Loc. cit., p. 344.

Further Literature. — In concluding this chapter we add a list of books
on quantum mechanics which the reader may find useful and interesting.
To begin with source material, there are: Schrödinger's charming volume
"Wave Mechanics" (Blackie and Son, London, 1928), which is a collection
of his epoch-making papers of 1926 and 1927; Heisenberg's more popular
"The Physical Principles of the Quantum Theory" (Chicago University
Press, Chicago, 1930); De Broglie and Brillouin's "Selected Papers on
Wave Mechanics" (Blackie and Son, London, 1928); and Born and
Jordan's "Elementare Quantenmechanik" (J. Springer, Berlin, 1930).
The foundations of the subject, both mathematical and philosophical, are
treated most thoroughly but also most abstractly by Dirac in his
"Principles of Quantum Mechanics" (Clarendon Press, Oxford, Second Edition,
1935) and by J. v. Neumann in "Mathematische Grundlagen der
Quantenmechanik" (J. Springer, Berlin, 1932). A briefer and less technical
résumé of the philosophical foundations, together with an excellent
treatment of the theory of measurement, may be found in the booklet entitled
"La Théorie de l'Observation en Mécanique Quantique" (Hermann et Cie,
Paris, 1939) by F. London and E. Bauer.

General treatises of a didactic nature are:

Condon, E. U., and Morse, P. M., "Quantum Mechanics," McGraw-Hill
Book Co., New York, 1929.

Ruark, A. E., and Urey, H. C., "Atoms, Molecules, and Quanta,"
McGraw-Hill Book Co., New York, 1930.

De Broglie, L., "Théorie de la Quantification," Hermann et Cie, Paris,
1932.

Frenkel, J., "Wave Mechanics," Vols. I and II, Clarendon Press, Oxford,
1932, 1934.

"Handbuch der Physik," Vol. XXIV, parts I and II (numerous authors),
Julius Springer, Berlin, 1933.

Pauling, L., and Wilson, E. B., "Introduction to Quantum Mechanics,"
McGraw-Hill Book Co., New York, 1935.

Kemble, E. C., "The Fundamental Principles of Quantum Mechanics,"
McGraw-Hill Book Co., New York, 1937.

Sommerfeld, A., "Atombau und Spektrallinien," Vol. II, Vieweg und
Sohn, Braunschweig, 1939.

Simpler and less comprehensive treatments, written with pedagogical
skill, are:

Dushman, S., "Elements of Quantum Mechanics," John Wiley and
Sons, New York, 1938.

Rojanski, V., "Introductory Quantum Mechanics," Prentice-Hall,
New York, 1939.

Among the popular expositions of the subject, proceeding with a minimum
of mathematics, the following may be mentioned:

Gurney, R. W., "Elementary Quantum Mechanics," Cambridge University
Press, Cambridge, 1934.

Flügge, S., und Krebs, A., "Experimentelle Grundlagen der
Wellenmechanik," T. Steinkopf, Dresden, 1936.

A list of books in which quantum mechanics is applied to special problems
follows.

Van Vleck, J. H., "The Theory of Electric and Magnetic Susceptibilities,"
Clarendon Press, Oxford, 1932.

Mott, N. F., and Massey, H. S. W., "The Theory of Atomic Collisions,"
Clarendon Press, Oxford, 1933.

Condon, E. U., and Shortley, G. H., "The Theory of Atomic Spectra,"
Macmillan, New York, 1935.

Kronig, R. de L., "The Optical Basis of Chemical Valence," Cambridge
University Press, Cambridge, 1935.

Heitler, W., "The Quantum Theory of Radiation," Clarendon Press,
Oxford, 1936.

Seitz, F., "Modern Theory of Solids," McGraw-Hill Book Co., New
York, 1940.

Pauling, L., "The Nature of the Chemical Bond," Cornell University
Press, Ithaca, N. Y., 1940.

Rice, O. K., "Electronic Structure and Chemical Binding," McGraw-Hill
Book Co., New York, 1940.
12.1. Permutations and Combinations. — The purpose of the present
chapter is not primarily an exposition of the ideas of statistical mechanics,
which is available in several modern texts,¹ but a brief and summary review
of the chief analytical techniques used in the treatment of this subject.
We begin by discussing the principal formulas of the theory of combinations.

a. The number of possible permutations of n different (distinguishable) 
objects is n! 

The proof is simple: the first object can be put in $n$ different positions.
When its place is fixed, $n - 1$ different positions are left open for the
second. Hence these two objects can be arranged in $n(n - 1)$ different ways
without disturbing the relative order of the remaining $(n - 2)$ objects.
But the third can occupy $n - 2$ different places, and so on. The total
number of possible arrangements is therefore $n(n - 1)(n - 2) \cdots 2 = n!$


b. Suppose we wish to arrange the $n$ objects in $r$ piles, the number in
each pile being prescribed. Let the number of objects in the first pile be
$n_1$, that in the second $n_2$, etc., so that $\sum_{i=1}^{r} n_i = n$. It is desired
to find the number, $M$, of possible arrangements of this kind. If $M$ is
multiplied by the number of possible permutations of all objects in the first
pile, then by the number of possible permutations of the objects in the second
pile and so on for all the piles, we must obtain the total number of
permutations of $n$ objects. Thus

$$M\, n_1!\, n_2! \cdots n_r! = n!$$

whence

$$M = \frac{n!}{n_1!\, n_2! \cdots n_r!} \tag{12-1}$$


There is another combinatorial problem which leads to the same result.
Suppose the $n$ objects fall into $r$ classes, the objects in each class being alike
(indistinguishable). Let the first class contain $n_1$ objects, the second $n_2$,
etc. The number of possible distinguishable arrangements of the $n$ objects
will then be seen to be obtainable by the reasoning employed above.
Hence $M$ represents also the number of arrangements of $n$ things groupable
into $r$ classes, the members of each class being alike.

¹ Tolman, R. C., "The Principles of Statistical Mechanics," Clarendon Press,
Oxford, 1938. Chapman, S. and Cowling, T. G., "The Mathematical Theory of
Non-Uniform Gases," University Press, Cambridge, 1939. Mayer, J. E. and Mayer, M. G.,
"Statistical Mechanics," John Wiley and Sons, 1940. Lindsay, R. B., "Physical
Statistics," John Wiley and Sons, 1941.
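
The count in eq. (12-1) can be checked by direct enumeration for small cases. The following is a minimal sketch, not part of the original text; the function names `multinomial` and `distinct_arrangements` are illustrative assumptions. It compares the closed form with a brute-force count of distinguishable orderings of objects that are alike within each class.

```python
from itertools import permutations
from math import factorial


def multinomial(counts):
    """Closed form of eq. (12-1): n! / (n_1! n_2! ... n_r!)."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result


def distinct_arrangements(counts):
    """Brute force: number of distinguishable orderings when the objects
    of class i (counts[i] of them) are alike."""
    objects = [label for label, c in enumerate(counts) for _ in range(c)]
    return len(set(permutations(objects)))


counts = (2, 1, 2)  # n = 5 objects falling into r = 3 classes
print(multinomial(counts), distinct_arrangements(counts))
```

The brute-force routine grows as $n!$ and is meant only for small $n$; it serves to confirm that both readings of the problem (piles and classes) give the same number.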

c. The number of ways in which $m$ objects can be selected from a set of
$n$ objects is $n!/[m!(n - m)!]$. This follows at once from (1), for a
withdrawal of $m$ individuals is equivalent to an arrangement of the $n$ objects
into two piles, one containing $m$, the other $(n - m)$ objects. We write
this number as

$$\frac{n!}{m!(n - m)!} = \binom{n}{m} \tag{12-2}$$

It is often referred to as the number of combinations of $n$ things taken $m$ at a
time. We observe that, since

$$\binom{n}{m} = \binom{n}{n - m}$$

it is equal to the number of combinations of $n$ things taken $n - m$ at a time.

Eq. (2) also provides the answer to another, apparently different ques- 
tion. Assume that we have n boxes, and a smaller number, m, of indis- 
tinguishable objects to be placed in them in such a way that no box con- 
tains more than one object. The number of ways in which this can be 
done is given by (2), for the assignment of m objects to n boxes is entirely 
equivalent to the selection of m objects from a set of n objects. 


d. When, in accordance with theorem (c), a certain selection of $m$
objects has been made, a permutation among these $m$ objects does not
produce a new combination. It does, however, produce a new arrangement.
Thus, to every combination given by eq. (2), there correspond $m!$
arrangements of the $m$ objects. The total number of arrangements of $n$
things taken $m$ at a time is therefore

$$m!\binom{n}{m} = \frac{n!}{(n - m)!} \tag{12-3}$$

If, in the problem of placing $m$ objects into $n$ boxes ($n \ge m$) discussed
in (c), the objects are assumed to be distinguishable, so that our interest is
no longer merely in the individual boxes each of which contains an object,
but also in the arrangement of the individual objects placed in them,
eq. (3) is applicable. It expresses the number of ways in which $m$
distinguishable objects can be placed in $n$ boxes, zero or one object per box.

e. The problem is more complicated when no restriction is imposed on
the number of particles in a box. Let us determine the number of ways in
which $m$ indistinguishable particles can be put into $n$ boxes, any
number of particles from 0 to $m$ being permitted in a box. We first note
that $m$ can be written as a sum of $t$ integers in

$$N_t = \binom{m - 1}{t - 1} \tag{12-4}$$

ways, a fact which can be derived from eq. (2) and is also easily proved by
induction. Now suppose that $m$ is decomposed into $t$ groups. These $t$
distinguishable groups can be placed into $n$ boxes in $\binom{n}{t}$ ways, in
accordance with eq. (2), and all decompositions into $t$ groups can be
accommodated in the $n$ boxes in $N_t \binom{n}{t}$ ways. If we sum this product over all $t$
we obtain the number in question; it is

$$\sum_{t=1}^{m} \binom{n}{t}\binom{m - 1}{t - 1}$$

Since $N_0 = 0$, this sum may be written

$$\sum_{t=0}^{m} \binom{n}{t}\binom{m - 1}{t - 1} = \sum_{t=0}^{m} \binom{n}{t}\binom{m - 1}{m - t}$$

In view of eq. (7), to be derived in the next section, the summation yields

$$\binom{n + m - 1}{m}$$

This represents the total number of ways in which $m$ indistinguishable
objects can be put into $n$ boxes. In the mathematical literature it is
sometimes known as the number of "combinations with repetitions."

f. The number of ways in which $m$ distinguishable objects may be placed
in $n$ boxes is clearly

$$n^m$$

for the first object can be put into $n$ places; with each of these dispositions
of the first object can be combined $n$ dispositions of the second object,
and so on.
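
The four box-filling counts of paragraphs (c) to (f) can be compared with explicit enumeration. The sketch below is an illustration only; the function names are assumptions, and the closed form used for case (d) presumes $n \ge m$. It also evaluates the sum over $t$ used in (e), whose value should agree with $\binom{n+m-1}{m}$.

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import comb, factorial


def brute_force_counts(n, m):
    """Enumerate placements of m objects in n boxes for cases (c)-(f)."""
    boxes = range(n)
    return {
        # (c) indistinguishable objects, at most one per box
        "c": sum(1 for _ in combinations(boxes, m)),
        # (d) distinguishable objects, at most one per box
        "d": sum(1 for _ in permutations(boxes, m)),
        # (e) indistinguishable objects, any number per box
        "e": sum(1 for _ in combinations_with_replacement(boxes, m)),
        # (f) distinguishable objects, any number per box
        "f": sum(1 for _ in product(boxes, repeat=m)),
    }


def closed_form_counts(n, m):
    """Closed forms quoted in the text (case d assumes n >= m)."""
    return {
        "c": comb(n, m),
        "d": factorial(n) // factorial(n - m),
        "e": comb(n + m - 1, m),
        "f": n ** m,
    }


def sum_over_groups(n, m):
    """Sum over t of C(n, t) * C(m-1, t-1), as in paragraph (e)."""
    return sum(comb(n, t) * comb(m - 1, t - 1) for t in range(1, m + 1))


n, m = 4, 3
print(brute_force_counts(n, m))
print(closed_form_counts(n, m))
print(sum_over_groups(n, m))
```

Cases (c) and (e) are the counts that reappear in quantum statistics (at most one particle per state, or any number per state), while (f) is the classical count for distinguishable particles.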


12.2. Binomial Coefficients. — The coefficients $\binom{n}{t}$ appear in Newton's
famous binomial expansion

$$(a + b)^n = \sum_{t=0}^{n} \binom{n}{t} a^t b^{n-t} \tag{12-6}$$

where $t$ is an integer. Its proof is fairly obvious, since the number of ways
in which $t$ factors $a$ and $(n - t)$ factors $b$ can be selected from $n$ factors
$(a + b)$ is $\binom{n}{t}$, by virtue of (2). We note that

$$\binom{n}{0} = 1$$

also

$$\binom{n}{t} = 0 \quad \text{if } t > n,$$

this because $(n - t)! = \infty$.

Most of the relations to be studied here are valid for non-integral values of
$n$ provided we define

$$\binom{x}{t} = \frac{x(x - 1) \cdots (x - t + 1)}{1 \cdot 2 \cdots t}$$

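An important series in binomial coefficients may be obtained as follows.
$\binom{n + k}{r}$ is the coefficient of $a^r b^{n+k-r}$ in the expansion of
$(a + b)^{n+k}$. But

$$(a + b)^{n+k} = (a + b)^n (a + b)^k = \sum_t \binom{n}{t} a^t b^{n-t} \sum_s \binom{k}{s} a^s b^{k-s} = \sum_{t,s} \binom{n}{t}\binom{k}{s} a^{t+s} b^{n+k-t-s}$$

The coefficient of $a^r b^{n+k-r}$ in this double sum is obtained by putting
$t + s = r$ and summing over $t$. Hence

$$\binom{n + k}{r} = \sum_t \binom{n}{t}\binom{k}{r - t} \tag{12-7}$$

This is known as the addition theorem of the binomial coefficients.
From it, numerous other relations can be derived.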

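On putting $k = 1$, we have

$$\binom{n + 1}{r} = \binom{n}{r} + \binom{n}{r - 1}$$

If $k = r = n$,

$$\binom{2n}{n} = \sum_t \binom{n}{t}^2$$

If we observe that

$$\binom{-1}{s} = (-1)^s$$

we may also put $k = -1$ in (7), obtaining

$$\binom{n - 1}{r} = \sum_t (-1)^{r-t} \binom{n}{t}$$

If, in Newton's formula, we let $a = b = 1$, we find

$$\sum_{t=0}^{n} \binom{n}{t} = 2^n$$

but if $a = -b$,

$$\sum_{t=0}^{n} (-1)^t \binom{n}{t} = 0$$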
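
The addition theorem (12-7) and the special cases that follow from it can be spot-checked numerically. The sketch below is an illustration, not the authors' procedure: the function names are assumptions, and the explicit special-case identities tested (Pascal's rule for $k = 1$, the sum of squares for $k = r = n$, the alternating partial sum for $k = -1$, and the two row sums from Newton's formula) are the standard consequences of (7) and (6).

```python
from math import comb


def gen_binom(x, t):
    """Generalized coefficient x(x-1)...(x-t+1) / (1*2*...*t) for integer x."""
    num, den = 1, 1
    for j in range(t):
        num *= x - j
        den *= j + 1
    return num // den  # exact: t consecutive integers are divisible by t!


def addition_theorem_holds(n, k, r):
    """Eq. (12-7): C(n+k, r) = sum over t of C(n, t) * C(k, r-t)."""
    return comb(n + k, r) == sum(comb(n, t) * comb(k, r - t)
                                 for t in range(r + 1))


def special_cases_hold(n, r):
    """Special cases of (12-7) and (12-6), assuming n >= 1 and r >= 0."""
    pascal = comb(n + 1, r) == comb(n, r) + (comb(n, r - 1) if r >= 1 else 0)
    squares = comb(2 * n, n) == sum(comb(n, t) ** 2 for t in range(n + 1))
    minus_one = gen_binom(n - 1, r) == sum((-1) ** (r - t) * comb(n, t)
                                           for t in range(r + 1))
    row_sum = sum(comb(n, t) for t in range(n + 1)) == 2 ** n
    alternating = sum((-1) ** t * comb(n, t) for t in range(n + 1)) == 0
    return pascal and squares and minus_one and row_sum and alternating


print(all(addition_theorem_holds(n, k, r)
          for n in range(6) for k in range(6) for r in range(n + k + 1)))
print(all(special_cases_hold(n, r)
          for n in range(1, 8) for r in range(n + 3)))
```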

12.3. Elements of Probability Theory. — An aggregate of elements, such
as a set of observations or a sequence of results of some operation (e.g.,
throwing a die), is called a probability aggregate if it is permissible to apply
the rules of the probability calculus to the aggregate. Whether or not this
application is proper is usually decided on the basis of intuition: it seems
clear that the decimal expansion of the fraction $\tfrac{1}{7}$ does not form an
aggregate of digits to which probability considerations may validly be applied;
on the other hand, no hesitation is felt in subjecting the outcome of a series
of throws of a die to probability reasoning. In the former sequence
(.142857142857, etc.) the digits occur with too much regularity to be
regarded as distributed "at random." The criteria for randomness,
which decide whether an aggregate is a probability aggregate, may be
stated with considerable precision² but will be omitted here.

Every element is regarded as having one of a number, $s$, of distinguishable
properties. (Each throw of a die is an element, the number appearing
uppermost is a property; $s = 6$. In measuring a physical quantity, each
measurement is an element, each measured value a property; $s$ may be
infinite in this example.) If $n_i$ is the number of times the $i$-th property
occurs and $n$ the total number of elements,

$$\frac{n_i}{n}$$

is defined as the relative frequency of the $i$-th property. By the probability
of the $i$-th property is meant the limit

$$\lim_{n \to \infty} \frac{n_i}{n} = w_i \tag{12-8}$$

The existence of this limit is a matter which has given rise to considerable
discussion; it will here be assumed.³ The totality of the $w_i$ is called the
distribution of the probability aggregate. Obviously, $\sum_i w_i = 1$.

² See Lindsay, R. B., and Margenau, H., "Foundations of Physics," John Wiley
and Sons, 1936.

The properties may be discrete (throwing dice) or continuous (value of a
physical quantity, such as position of a particle). In the former case the
distribution is sometimes said to be arithmetical, in the latter case,
geometrical. In the continuous case a different formulation of probability is more
convenient. Let $x$ denote the continuous property. The probability
$w_x$, defined by (8), is clearly zero, but the probability that $x$ shall lie between
$x$ and $x + \Delta x$ is finite and is, moreover, usually proportional to the range
$\Delta x$ provided this range is sufficiently small. Hence we may write for this
probability

$$w(x)\,\Delta x$$

and the function $w(x)$, which does not have the physical dimension of a
probability (a pure number), is called the probability density. Clearly,

$$\int w(x)\,dx = 1$$

if the integral is taken over the entire range of properties.

When a distribution $w_i$ or $w(x)$ is given, certain expressions frequently
occurring in statistical theories can be calculated. We present the most
important of these, using parallel formulations for the arithmetical and
geometrical cases. To make this possible, we write $w(x_i)$ for the former
$w_i$, thus letting $x_i$ represent the $i$-th property.

If $f(x)$ is a function defined for every $x_i$ (or $x$) which has a non-vanishing
probability, the mean of $f(x)$ with respect to the distribution $w(x)$ is
given by

$$\bar{f} = \begin{cases} \sum_i f(x_i)\,w(x_i) \\ \int f(x)\,w(x)\,dx \end{cases} \tag{12-9}$$

The dispersion of $f(x)$ with respect to $w(x)$ is defined by

$$D(f) = \begin{cases} \sum_i [f(x_i) - \bar{f}]^2\,w(x_i) \\ \int [f(x) - \bar{f}]^2\,w(x)\,dx \end{cases} \tag{12-10}$$

On taking for the function $f(x)$ the variable $x$ itself there results

$$\bar{x} = \begin{cases} \sum_i x_i\,w(x_i) \\ \int x\,w(x)\,dx \end{cases} \tag{12-11}$$

³ For further remarks see Lindsay and Margenau, loc. cit., Chapter 4.
and

$$D(x) \equiv \sigma^2 = \begin{cases} \sum_i (x_i - \bar{x})^2\,w(x_i) \\ \int (x - \bar{x})^2\,w(x)\,dx \end{cases} \tag{12-12}$$

The quantity $\sigma^2$ is called the dispersion of the distribution $w(x)$; $\sigma$ is
known as the standard deviation. As is clear from its definition, $\sigma$ is a
measure of the spread of $w(x)$ about its mean. If $w(x)$ were regarded as a
distribution of mass, $\sigma$ would represent its radius of gyration. By the
$r$-th moment of the distribution is meant the quantity

$$\overline{x^r} = \begin{cases} \sum_i x_i^r\,w(x_i) \\ \int x^r\,w(x)\,dx \end{cases}$$

For distributions with an infinite range of properties, higher moments do
not always exist. The dispersion of $w(x)$ may be expressed in terms of its
first and second moments. In view of (12),

$$\sigma^2 = \overline{x^2} - 2\bar{x}^2 + \bar{x}^2 = \overline{x^2} - \bar{x}^2$$
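
For an arithmetical distribution the mean (12-9), dispersion (12-10) and moments reduce to finite sums. The following minimal sketch evaluates them for a fair die, $w(x_i) = 1/6$, with exact rational arithmetic; the helper names `mean`, `dispersion` and `moment` are illustrative assumptions rather than notation from the text.

```python
from fractions import Fraction


def mean(f, xs, ws):
    """Eq. (12-9), discrete case: sum of f(x_i) * w(x_i)."""
    return sum(f(x) * w for x, w in zip(xs, ws))


def dispersion(f, xs, ws):
    """Eq. (12-10), discrete case: mean of [f(x_i) - fbar]^2."""
    fbar = mean(f, xs, ws)
    return mean(lambda x: (f(x) - fbar) ** 2, xs, ws)


def moment(r, xs, ws):
    """r-th moment: mean of x^r."""
    return mean(lambda x: x ** r, xs, ws)


# Fair die: six properties, each with probability 1/6.
xs = list(range(1, 7))
ws = [Fraction(1, 6)] * 6

x_bar = moment(1, xs, ws)
sigma2 = dispersion(lambda x: x, xs, ws)
print(x_bar, moment(2, xs, ws), sigma2)
# The dispersion equals the second moment minus the squared mean.
print(sigma2 == moment(2, xs, ws) - x_bar ** 2)
```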


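Under certain conditions it is possible to expand a geometrical distribution
in terms of its moments, provided these exist. For simplicity we shall
take these moments about $\bar{x}$ as origin, so that $\bar{x} = 0$, etc. One
can then prove⁴ that

$$w(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^2/2\sigma^2} \left[ 1 + \sum_{i=3}^{\infty} \frac{c_i}{i!}\, H_i\!\left(\frac{x}{\sigma}\right) \right]$$

where $H_i$ is the $i$-th Hermite polynomial, and

$$c_3 = \frac{\overline{x^3}}{\sigma^3}, \quad c_4 = \frac{\overline{x^4}}{\sigma^4} - 3, \quad c_5 = \frac{\overline{x^5}}{\sigma^5} - 10\,\frac{\overline{x^3}}{\sigma^3}, \quad c_6 = \frac{\overline{x^6}}{\sigma^6} - 15\,\frac{\overline{x^4}}{\sigma^4} + 30$$

This expansion is particularly useful when $w(x)$ does not depart too
greatly from a normal "Gauss" distribution: $w(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^2/2\sigma^2}$.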

Problems. Two geometrical distributions of considerable interest in physics and
chemistry are

$$w_1(x) = \frac{h}{\sqrt{\pi}}\, e^{-h^2 (x - a)^2}, \qquad w_2(x) = \frac{a}{\pi}\, \frac{1}{a^2 + x^2}$$

a. Show that, for $w_1$, $\bar{x} = a$ and $\sigma^2 = 1/2h^2$.

b. Show that the $r$-th moment of $w_1$ is $1 \cdot 3 \cdot 5 \cdots (r - 1)/(2^{r/2} h^r)$ if $r$ is even; for
odd $r$ it is zero. (Take $a = 0$.) All moments of $w_1$ are finite.

c. Show that, for $w_2$, all odd moments are zero and no even moment (except the
zero-th) exists.

⁴ See Zernike, F., "Handbuch der Physik," Vol. III, J. Springer, 1928, p. 448.

12.4. Special Distributions. — A problem which is basic in statistical
mechanics and in the theory of errors will here be discussed in some detail.
It is of considerable historical interest, its solution being connected with the
names of Newton, Bernoulli, Laplace, Poisson and Gauss. Consider $n$
boxes, each containing $P$ black balls and $Q$ white balls. We wish to find
the probability $w_n(m)$ that, in drawing one ball from each of the $n$ boxes,
$m$ of them will be white.

The probability of drawing a black ball from a given box is clearly
$P/(P + Q) = p$, that of drawing a white ball is $Q/(P + Q) = q$. Thus
$p + q = 1$. If $n = 2$, the probability aggregate has the
following properties: $bb$, $bw$, $wb$, $ww$ ($b$ = black, $w$ = white), and these
occur with the probabilities $p^2$, $pq$, $pq$, $q^2$; hence $w_2(0) = p^2$, $w_2(1) =
2pq$, $w_2(2) = q^2$. In general, the probability that $m$ white balls will be
drawn from $m$ specified boxes and $n - m$ black ones from the remaining
boxes will be

$$q^m p^{n-m}$$

But in view of eq. (2) there are $\binom{n}{m}$ ways of selecting $m$ boxes from a
total number of $n$ boxes. Hence the answer to the problem, first found by
Newton, is

$$w_n(m) = \binom{n}{m} q^m p^{n-m} \tag{12-13}$$

It is clear from (6) that

$$\sum_{m=0}^{n} w_n(m) = 1$$

since $q + p = 1$. Eq. (13) has of course a more general significance than
the one here particularized; it represents the probability of $m$ successes in
$n$ independent trials if the probability of success in a single trial is $q$.

To calculate the mean of $m$ and the dispersion of the arithmetical
distribution $w_n(m)$ we consider the identity

$$(p + qy)^n = \sum_{m=0}^{n} w_n(m)\, y^m$$

where $y$ is a variable. On differentiation with respect to $y$ this reads

$$nq\,(p + qy)^{n-1} = \sum_{m=0}^{n} m\, w_n(m)\, y^{m-1} \tag{12-14}$$

When we let $y = 1$ in this equation, the right hand side becomes $\bar{m}$, so that

$$\bar{m} = nq \tag{12-15}$$

The mean number of successes is equal to the probability of success in a
single trial, multiplied by the number of trials.

To find the dispersion, we differentiate (14) once more, and then set
$y = 1$. The result is:

$$n(n - 1)q^2 = \sum_{m=0}^{n} m(m - 1)\, w_n(m) = \overline{m^2} - \bar{m}$$

To obtain the dispersion we must add to the right hand side the quantity
$\bar{m} - \bar{m}^2$ which, according to (15), equals $nq - n^2 q^2$. Hence

$$\sigma^2 = \overline{m^2} - \bar{m}^2 = nq(1 - q) = nqp \tag{12-16}$$

Especially interesting is the case where $q \ll p$, so that $p \approx 1$. For then
the dispersion is numerically equal to the mean number of successes, a
criterion which can sometimes be used to determine whether the successes are
due entirely to chance. For applications of the formulas here developed,
particularly to the case of radioactive emission, the reader is referred to
Lindsay's Physical Statistics. (See also the problem of the random walk
at the end of this section.)
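
The normalization of (12-13) and the results (12-15) and (12-16) can be confirmed by summing the distribution directly. This is a minimal sketch with exact rational arithmetic; the function name `w` and the sample values $n = 12$, $q = 1/6$ are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def w(n, m, q):
    """Eq. (12-13): probability of m successes in n independent trials."""
    p = 1 - q
    return comb(n, m) * q ** m * p ** (n - m)


n, q = 12, Fraction(1, 6)
dist = [w(n, m, q) for m in range(n + 1)]

total = sum(dist)                                     # should equal 1
m_bar = sum(m * wm for m, wm in enumerate(dist))      # eq. (12-15): n q
m2_bar = sum(m * m * wm for m, wm in enumerate(dist))
sigma2 = m2_bar - m_bar ** 2                          # eq. (12-16): n q p

print(total, m_bar, sigma2)
```

Because the arithmetic is exact, the computed mean and dispersion agree with $nq$ and $nqp$ identically rather than to rounding error.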

For large values of $n$ and $m$ expression (13) is difficult to use because of
the inconvenience in dealing with factorials of large numbers. We shall
now prove that in this case $w_n(m)$ can be approximated by the Gauss error
law. Let us first see what happens to $w_n(m)$ as $n \to \infty$. It is clear from
(15) and (16) that both $\bar{m}$ and $\sigma^2$ tend to infinity, that is to say, if we were
to plot $w_n(m)$ against $m$, the mean (which for sufficiently large $m$ is also
the maximum of $w_n(m)$) would move outward from the origin and the
distribution would broaden out indefinitely. However, the quantity $x$,
defined as the deviation from the mean and measured on a proper scale
which contracts as $n$ increases, namely

$$x = \frac{m - \bar{m}}{\sqrt{n}} \tag{12-17}$$

will remain finite. We shall try to convert $w_n(m)$ into $w(x)$, assuming that
$n \to \infty$.

First compute

$$\ln w_n(m) = \ln n! - \ln m! - \ln (n - m)! + (n - m) \ln p + m \ln q$$

Now by Stirling's formula, which is valid for large numbers,

$$\ln n! = (n + \tfrac{1}{2}) \ln n - n + \tfrac{1}{2} \ln 2\pi + \left(\text{terms of order } \tfrac{1}{n}\right)$$

Hence⁵

$$-\lim_{n \to \infty} \ln w_n(m) = \tfrac{1}{2} \ln \frac{2\pi m (n - m)}{n} + m \ln \frac{m}{nq} + (n - m) \ln \frac{n - m}{np} \tag{12-18}$$

In view of (17) and (15)

$$m = nq \left(1 + \frac{x}{q\sqrt{n}}\right), \qquad n - m = np \left(1 - \frac{x}{p\sqrt{n}}\right)$$

When these expressions are introduced in (18) there results

$$-\ln w(x) = \tfrac{1}{2} \ln 2\pi npq + \frac{x^2}{2q} + \frac{x^2}{2p}$$

provided we use the expansion of the logarithm

$$\ln (1 + x) = x - \frac{x^2}{2} + \cdots$$

and retain no terms in negative powers of $n$. Thus we have, since
$p + q = 1$,

$$-\ln w(x) = \tfrac{1}{2} \ln 2\pi npq + \frac{x^2}{2pq}$$

whence

$$w(x) = \frac{1}{\sqrt{2\pi npq}}\, e^{-x^2/2pq} \tag{12-19}$$

When written again in terms of $m$ it is

$$\lim w_n(m) = \frac{1}{\sqrt{2\pi npq}}\, e^{-(m - \bar{m})^2/2npq} = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(m - \bar{m})^2/2\sigma^2} \tag{12-20}$$

These results have a special significance with respect to errors of
measurement, as can be seen from the following (oversimplified) argument.
Suppose that the true value of a measured quantity is $A$, but that there
are $n$ causes of error, each of which will add to $A$ the amount $\Delta A$ or $-\Delta A$
with equal probability. If $m$ of these $n$ causes contribute $+\Delta A$ then the
resulting error is $r\,\Delta A = [m - (n - m)]\,\Delta A$, and therefore the probability
of this error is $w_n(m)$ with $m = (n + r)/2$. For large $n$ the distribution
of errors is then given by (20),⁶

$$w(r) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(r - \bar{r})^2/8\sigma^2}$$

⁵ In arriving at this result, it is convenient to add to the literal expansion of the
logarithm by Stirling's formula the quantity $(m \ln n - m \ln n)$.

⁶ Note, however, that $\sigma^2$ is no longer the dispersion with respect to the $r$-distribution.


Furthermore, 



w{r)dr 
If $p = q = \tfrac{1}{2}$, $\bar{m} = n/2$ and $\bar{r} = 0$. If we denote $r\,\Delta A$ by $\epsilon_r$, Gauss' error
law

$$w(\epsilon_r) = \text{const.}\; e^{-h^2 \epsilon_r^2}$$

immediately results; the constant $h$, which depends on $\Delta A$ and is always
determined empirically, is often called the "measure or index of precision."

In the analysis leading to (19) quantities of the order $1/nq$ and $1/npq$
were neglected, the assumption being that $p$ and $q$ are numbers not greatly
different from unity. Under these conditions the mean of $m$, $nq$, is a large
number. This, then, is a criterion for the applicability of eq. (19).

It may happen, however, that $q$ is small, so small indeed that $nq$ is of
order unity in a given application. In this case the distribution (13) has,
to be sure, spread out indefinitely ($n \to \infty$) but the mean has remained
small; the resulting distribution is quite asymmetrical. To deal with this
situation we put

$$q = \frac{a}{n}$$

and treat $a = \bar{m}$ as a number of order unity while $n$ tends to infinity. Thus

$$w_n(m) = \frac{n(n - 1) \cdots (n - m + 1)}{m!} \left(\frac{a}{n}\right)^m \left(1 - \frac{a}{n}\right)^{n-m} = \frac{a^m}{m!} \left(1 - \frac{a}{n}\right)^n \frac{\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{m - 1}{n}\right)}{\left(1 - \frac{a}{n}\right)^m}$$

As $n \to \infty$, the last fraction takes on the value 1. Hence, under these
conditions,

$$\lim w_n(m) = \frac{a^m e^{-a}}{m!} \tag{12-21}$$

since

$$\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x$$

Formula (21) was first derived by Poisson and bears his name. It is used
in the theory of radioactivity.
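
The two limiting forms can be compared numerically with the exact distribution (12-13). The sketch below is illustrative only: the function names and the sample values of $n$, $q$ and $a$ are assumptions, and floating-point arithmetic restricts it to moderate $n$ in the Gaussian comparison (for very large $n$ the binomial coefficients exceed the floating-point range).

```python
from math import comb, exp, factorial, pi, sqrt


def binomial(n, m, q):
    """Exact distribution, eq. (12-13)."""
    return comb(n, m) * q ** m * (1 - q) ** (n - m)


def gauss(n, m, q):
    """Gaussian limit, eq. (12-20), with mean n q and dispersion n q p."""
    sigma2 = n * q * (1 - q)
    return exp(-(m - n * q) ** 2 / (2 * sigma2)) / sqrt(2 * pi * sigma2)


def poisson(a, m):
    """Poisson limit, eq. (12-21)."""
    return a ** m * exp(-a) / factorial(m)


def max_gauss_error(n, q):
    """Largest absolute difference between (12-13) and (12-20) over all m."""
    return max(abs(binomial(n, m, q) - gauss(n, m, q)) for m in range(n + 1))


def max_poisson_error(n, a, m_max=20):
    """Largest absolute difference between (12-13) with q = a/n and (12-21)."""
    return max(abs(binomial(n, m, a / n) - poisson(a, m))
               for m in range(m_max + 1))


print(max_gauss_error(400, 0.5))      # p and q comparable, n q large
print(max_poisson_error(10000, 2.0))  # q small, n q of order unity
```

The first case corresponds to the criterion stated for (19), a large mean $nq$; the second to the regime in which the mean stays of order unity while $n$ grows.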


Problem a. Plot Wn(m) for g = n = 6, 10, 60. Observe the change from an 
asymmetrical to a symmetrical distribution. Compare Poisson’s formula with the 
plot for $n = 6$.

Problem b. "Random walk." A person, making steps of length $l$, is just as likely to
step forwards as backwards ($p = q = \tfrac{1}{2}$). Prove that, after taking $n$ steps, he will
have gone forward a distance $rl$ with a probability

$$w(r) = \binom{n}{\frac{n + r}{2}} \left(\tfrac{1}{2}\right)^n$$

Show also that $\bar{r} = 0$, $\overline{r^2} = n$.
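
The random-walk statements can be verified for a given $n$ by summing over all admissible displacements. The sketch below assumes the probability follows from (12-13) with $m = (n + r)/2$ forward steps and $p = q = \tfrac{1}{2}$; the function name `walk_prob` is an illustrative choice.

```python
from fractions import Fraction
from math import comb


def walk_prob(n, r):
    """Probability of a net displacement of r steps after n unbiased steps.

    Uses m = (n + r) / 2 forward steps; r must have the parity of n.
    """
    if abs(r) > n or (n + r) % 2:
        return Fraction(0)
    return Fraction(comb(n, (n + r) // 2), 2 ** n)


n = 10
rs = range(-n, n + 1)
total = sum(walk_prob(n, r) for r in rs)
r_bar = sum(r * walk_prob(n, r) for r in rs)
r2_bar = sum(r * r * walk_prob(n, r) for r in rs)
print(total, r_bar, r2_bar)
```

The mean square displacement grows linearly with the number of steps, so the root-mean-square distance grows only as $\sqrt{n}$.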

12.5. Gibbsian Ensembles. — It is the main purpose of statistical
mechanics to provide a formalism by means of which the facts of thermo- 
dynamics (cf. Chapter 1) can be deduced. This may be done in several 
different ways, that is, on the basis of several distinct sets of fundamental 
axioms. Two of these stand out for their success and clarity. One, the 
system of Gibbs, is particularly suited to a development of the classical 
laws of thermodynamics, i.e., those relations whose understanding is 
possible without the use of quantum mechanics. Gibbs’ statistical mechan- 
ics will be summarized in this section and the next. The remainder of the 
present chapter will be devoted to the method of Darwin and Fowler with 
the aid of which the subject of quantum statistics is most satisfactorily 
discussed. 

The central concept of Gibbs' theory⁷ is the ensemble, the meaning of
which will now be discussed. Statistical mechanics deals with certain 
properties of physical objects, as for instance a given body of gas, or liquid, 
or a solid. Such an object will be called a system, or, more specifically, a 
thermodynamic system. If it has n degrees of freedom, then its complete 
mechanical state can be specified in terms of n generalized coordinates and 
n generalized momenta, a total of 2n numbers. Mathematically, these 2n 
numbers may be said to define a point in a space of 2n dimensions, and this 
space is called the phase space of the system. At any instant of time, the 
system is represented by one point in its phase space, and in the course of 
time, this point will move, describing a certain trajectory in phase space. 
When the position of the representative point at any instant is known, it is 
theoretically possible by the laws of mechanics to calculate its position at 
any other time, but such a prediction is practically not feasible. Other, 
less detailed methods of description must be chosen. 

In the simplest instance of a thermodynamic system, the ideal gas
consisting of $\nu$ molecules, $n = 3\nu$, and the phase space has $6\nu$ dimensions. A
representative point would correspond to an exact assignment of 3
components of momentum and position to each of the $\nu$ molecules, and the
path of the point would portray the changes which the values of all these
quantities undergo in time. In this case, another picture is often useful.
One may regard the phase space of the system as being composed of $\nu$
subspaces, one for each molecule. Such a subspace is called a $\mu$-space
("molecule space") in order to distinguish it from the entire phase space,
which is often designated as $\gamma$-space ("gas space"). In the case of
molecules regarded as mass points, $\mu$-space has 6 dimensions, although in general
for molecules having internal degrees of freedom the number of dimensions
is greater. Use of the $\mu$-space is often very convenient, but it loses its
significance except as an approximate description when strong interaction
exists between the molecules.

⁷ There is no more lucid and careful exposition of J. W. Gibbs' ideas than his own,
"Elementary Principles of Statistical Mechanics," C. Scribner's Sons, 1902;
"Collected Works," Vol. II, Longmans, Green & Co., New York, 1928. See also
"Commentary on the Scientific Writings of J. Willard Gibbs," Yale University Press,
1936.

We shall denote the $n$ generalized coordinates of our system by
$q_1 q_2 \cdots q_n$, the generalized momenta by $p_1 p_2 \cdots p_n$. Out of these we
construct an element $d\phi$ of phase space in which the $p$'s and $q$'s are taken as
Cartesian coordinates:

$$d\phi = dq_1\, dq_2 \cdots dq_n\, dp_1\, dp_2 \cdots dp_n$$

It is possible to show⁸ that any point transformation

$$q_i' = q_i'(q_1 \cdots q_n), \qquad p_i' = p_i'(q_1 \cdots q_n,\, p_1 \cdots p_n)$$

leaves $d\phi$ invariant; thus

$$d\phi = dq_1' \cdots dq_n'\, dp_1' \cdots dp_n'$$

Since the system is assumed to obey mechanical laws, Hamilton's canonical
equations must be valid (cf. 9-13):

$$\dot{p}_i = -\frac{\partial H}{\partial q_i}, \qquad \dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad i = 1, 2, \cdots n \tag{12-22}$$

From these equations it follows at once that through every point in phase
space there passes but one trajectory; for when every $p$ and $q$ is given,
equations (22) determine uniquely the rate of change of every coordinate in
phase space. Hence the representative point can never cross its previous
path. Whether the motion of the point will ultimately carry it through all
regions of phase space has not been proved completely; such behavior is
tentatively asserted by the so-called ergodic hypothesis⁹ which, however, is
not needed in Gibbs' formulation of statistical theory.
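
To see how eqs. (12-22) fix the motion of a representative point, one can follow a single point numerically. The sketch below is an illustration under stated assumptions: a one-dimensional harmonic oscillator with unit mass and force constant, derivatives of $H$ taken by central differences, and a fourth-order Runge-Kutta step. None of these choices comes from the text.

```python
from math import cos, sin


def hamiltonian(q, p):
    """Assumed example: H = p^2/2 + q^2/2 (unit mass and force constant)."""
    return 0.5 * p * p + 0.5 * q * q


def rates(q, p, h=1e-6):
    """Eq. (12-22) by central differences: qdot = dH/dp, pdot = -dH/dq."""
    qdot = (hamiltonian(q, p + h) - hamiltonian(q, p - h)) / (2 * h)
    pdot = -(hamiltonian(q + h, p) - hamiltonian(q - h, p)) / (2 * h)
    return qdot, pdot


def rk4_step(q, p, dt):
    """Advance the phase point (q, p) by one Runge-Kutta step of length dt."""
    k1q, k1p = rates(q, p)
    k2q, k2p = rates(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = rates(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = rates(q + dt * k3q, p + dt * k3p)
    return (q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6,
            p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6)


def integrate(q, p, dt, steps):
    """Follow the trajectory through phase space for a number of steps."""
    for _ in range(steps):
        q, p = rk4_step(q, p, dt)
    return q, p


q0, p0 = 1.0, 0.0
dt, steps = 1e-3, 3000
q1, p1 = integrate(q0, p0, dt, steps)
print(q1, p1, hamiltonian(q1, p1) - hamiltonian(q0, p0))
```

For this Hamiltonian the exact trajectory through $(1, 0)$ is $q = \cos t$, $p = -\sin t$, a closed curve of constant $H$, which the numerical point follows closely.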

⁸ See Gibbs, loc. cit., or Lindsay, loc. cit.

⁹ See Tolman, loc. cit.

It would seem that the values of thermodynamic quantities such as
temperature, pressure, etc., could be regarded as time averages over the
motion of the representative point of the system in its own phase space.
The development of this conjecture, however, is fraught with rather
formidable difficulties and is not usually attempted. Instead, Gibbs
introduces what he calls an ensemble of systems, by which is meant a very
large set of imagined replicas of the one real system under consideration.
These systems are not in identical states, but the state of each is
represented by a phase point in its own phase space. Since all imaginary
systems of the ensemble are similar as to number of molecular constituents
and Hamiltonian function, all points can be plotted in the same phase space,
in which they will be distributed with a certain density, $D$.

This density will in general be different in different parts of phase space,
and it will change in time. Hence

$$D = D(p_1 \cdots p_n;\, q_1 \cdots q_n;\, t)$$

Nothing has as yet been said about the initial distribution of points in
phase space which, in view of the meaning of the ensemble, is quite
arbitrary. Whatever the functional form of $D$, we must require

$$\int D\, d\phi = N \quad \text{for every } t$$

if $N$ is the (very large) number of systems in the ensemble. It is
convenient also to introduce a "probability of phase"

$$P = \frac{D}{N} \tag{12-23}$$

such that

$$\int P\, d\phi = 1$$

12.6. Ensembles and Thermodynamics. — By virtue of Liouville's
theorem, proved in almost all books on statistical mechanics (also known
as the principle of conservation of density-in-phase, a name due to Gibbs),
the representative points move in phase space as though they constituted
an incompressible fluid of varying density. A group of points filling a
certain region of phase space at a time $t_0$ can neither contract nor expand
during its motion; it will continue to occupy the same volume but with
altered shape. Mathematically these statements are expressed as follows:

$$\frac{dD}{dt} = \frac{\partial D}{\partial t} + \sum_{i=1}^{n} \frac{\partial D}{\partial p_i}\, \dot{p}_i + \sum_{i=1}^{n} \frac{\partial D}{\partial q_i}\, \dot{q}_i = 0 \tag{12-24}$$

Thus it is seen that phase space possesses no intrinsic property of
accumulating phase points in some regions or not admitting them to others;
Liouville's theorem shows phase space to be indifferent to the motion of the
points. This fact suggests the following fundamental postulate, by means
of which contact is established between an ensemble and thermodynamic
experience:

The probability that, at any instant $t$, a given real system be found in
the state characterized by $q_1 \cdots q_n$, $p_1 \cdots p_n$ is the same as the probability
$P(p_1 \cdots p_n;\, q_1 \cdots q_n;\, t)$ that a system selected at random from the
corresponding ensemble shall have the phase $q_1 \cdots q_n$, $p_1 \cdots p_n$ at the instant $t$.
The probability that the values of the $p$'s and $q$'s shall lie within a small
extension of phase $\Delta\phi$ is proportional to $\Delta\phi$. We are thus attributing equal
intrinsic probabilities to equal volumes of phase space, a procedure
suggested, though not made necessary, by Liouville's principle.

In accordance with this postulate we may calculate mean values of
dynamical quantities of the real system by computing mean values over
the individuals composing the ensemble. If $R$ is such a quantity,
expressible as a function of momenta and coordinates, then

$$\bar{R}(t) = \int R(p_1 \cdots p_n;\, q_1 \cdots q_n)\, P(p_1 \cdots p_n;\, q_1 \cdots q_n;\, t)\, d\phi \tag{12-25}$$

And by $\bar{R}(t)$ is meant in general the expected mean value of the quantity $R$
which would be obtained when $R$ is actually measured at the time $t$. It
can be shown that deviations from this expected mean are extremely small
when the system in question has many degrees of freedom, so that the
expected mean may be identified for practical purposes with the value of $R$
actually measured in a single observation. Moreover, we shall see at once
that under equilibrium conditions $P$ is not a function of $t$, so that $\bar{R}$, also,
will not be a function of $t$. One may then think of $\bar{R}$ as the mean value
of the quantity $R$ in a temporal sense, i.e., $\bar{R} = \frac{1}{T} \int_0^T R\, dt$ for sufficiently
large $T$, without violating the spirit of the postulate.

If the thermodynamic system is in equilibrium, the number of representative
points in any given extension in phase, $\Delta\phi$, must remain constant
in time. The condition of equilibrium may therefore be stated in the form

$$\frac{\partial D}{\partial t} = 0$$

For the equilibrium case, in which we are chiefly interested (a reversible
thermodynamic change consists of a sequence of equilibrium states),
Liouville's theorem states

$$\sum_{i=1}^{n}\left(\frac{\partial D}{\partial p_i}\,\dot{p}_i + \frac{\partial D}{\partial q_i}\,\dot{q}_i\right) = 0$$ (12-26)


Let us now give thought to the initial form of the function
$D(p_1 \cdots p_n; q_1 \cdots q_n)$. We know that if it satisfies eq. (26), then it
implies $\partial D/\partial t = 0$ and corresponds to an equilibrium condition of the
thermodynamic system. Hence $D$ will forever be independent of $t$. But we
note that if we put $D = D(H)$, where $H$ is the Hamiltonian function of
the system, $D$ will certainly satisfy eq. (26); for the left-hand side of that
equation will read

$$\frac{dD}{dH}\sum_{i=1}^{n}\left(\frac{\partial H}{\partial p_i}\,\dot{p}_i + \frac{\partial H}{\partial q_i}\,\dot{q}_i\right)$$

and this vanishes because of (22). Hence we take $D = D(H)$.

Further restrictions on $D$ cannot be imposed on the basis of mechanical
or statistical reasoning, except for the obvious facts that $D$ must be
everywhere positive and must satisfy

$$\int D\,d\phi = N$$

However, the choice of the function must be such as to lead to the
thermodynamic formulas when thermodynamic quantities are computed by
eq. (25). The important choices by which this success can be achieved, as
Gibbs has shown, are these:

$$D(H) = \text{const. when } E_0 < H < E_0 + \Delta E; \qquad D(H) = 0 \text{ for all other values of } H$$ (12-27)

$$D(H) = C e^{-H/\theta}$$ (12-28)


where $C$ and $\theta$ are positive constants. The first is called the microcanonical
or energy shell ensemble, the second the canonical ensemble. The energy
shell ensemble seems most reasonable from the physical point of view, for a
system in equilibrium is one of fixed total energy, i.e., fixed within an
interval of error $\Delta E$, and systems not having an energy within this range
are excluded from consideration. However, the canonical ensemble,
although it assigns a finite density to points corresponding to those members
of the ensemble which do not satisfy the requirement of constant
energy, also leads to the correct thermodynamic relations. Since it is
mathematically easier to handle, it enjoys greater popularity than the
former, and was indeed preferred by Gibbs.

The connection between the two types of ensemble may be exhibited in
the following way. Consider a gas whose phase density in $\gamma$-space is
represented by a microcanonical ensemble. Let it consist of $\nu$ molecules
with $\mu$-spaces $\mu_1$, $\mu_2$, etc., with probability distribution $P_i$ in space $\mu_i$.
Denote the element of extension in $\mu_i$ by $d\phi_i$. Since energy exchanges may
take place between the molecules, $P_i$ cannot be represented by a
microcanonical distribution; it must indeed be finite for all energies, $H_i$, of the
$i$-th molecule. Nevertheless, the probability that molecule 1 be within
the element $d\phi_1$ of its $\mu$-space, molecule 2 within $d\phi_2$ of its $\mu$-space, etc.,





simultaneously, equals the probability that the whole gas be in the element
$d\phi = d\phi_1 d\phi_2 \cdots d\phi_\nu$ of $\gamma$-space. Hence

$$P_1(H_1)\,d\phi_1\, P_2(H_2)\,d\phi_2 \cdots P_\nu(H_\nu)\,d\phi_\nu = P(H)\,d\phi$$ (12-29)

so that

$$P_1(H_1)\, P_2(H_2) \cdots P_\nu(H_\nu) = P(H_1 + H_2 + \cdots + H_\nu)$$

We wish this functional equation to be satisfied for every value of the total
energy $H = \sum_i H_i$, although, of course, for any given $H$ the constant $P(H)$
may be described by the microcanonical distribution. The solution of
eq. (29) therefore leads to a very natural extension of this distribution.

Eq. (29) holds for every $\nu$. If the gas consisted of only 2 molecules,

$$P_1(H_1) \cdot P_2(H_2) = P(H_1 + H_2)$$

Hence it follows that $P_i(0) = 1$ for every $i$. If we denote $\log P_i$ by $f_i$, we
have

$$f_1(H_1) + f_2(H_2) = f(H_1 + H_2)$$ (12-30)

On putting $H_2 = 0$, this reads

$$f_1(H_1) + f_2(0) = f(H_1)$$

and since $f_2(0) = 0$, $f_1 = f$. Thus all $f_i$ are seen to be the same function, $f$.
We are thus led to consider the equation

$$f(x) + f(y) = f(x + y)$$

When $y$ is taken equal to $x$, we have $2f(x) = f(2x)$, and so by induction,

$$f(nx) = n f(x)$$

for every integer $n$. From this relation,

$$f\!\left(\frac{x}{m}\right) = \frac{1}{m} f(x)$$

and

$$f\!\left(\frac{n}{m}\,x\right) = \frac{n}{m} f(x)$$

where $m$ is another integer. Finally,

$$f(x) = f(x \cdot 1) = x f(1) = \text{const.} \cdot x$$

We have shown that the only function which satisfies eq. (30) is $f_i = cH_i$,
whence

$$P_i(H_i) = e^{cH_i}$$

But $P_i$, being a probability, must remain finite for every $H_i$, a quantity
which may tend to $+\infty$, though not to $-\infty$. Hence $c$ is a negative
constant. Following Gibbs, we write for it $-1/\theta$, so that finally

$$P_i(H_i) = e^{-H_i/\theta}$$ (12-31)

This defines the canonical ensemble, in accordance with eq. (28). The
constant $C$ in (28) has been introduced to insure normalization in $\gamma$-space:
$\int P\,d\phi = 1$; the functions $P_i$ in (29) are not properly normalized in each
$\mu$-space, as is evident on closer inspection.


Problem. Consider as system a single particle of mass $m$ in a constant gravitational
field. Note that the microcanonical ensemble is given by Fig. 1, where all points not
lying between the two parabolas $A$ and $B$, corresponding to $H = E_0$ and
$H = E_0 + \Delta E$, have zero density. Show that the group of points lying between $p_1$ and
$p_2$ at $t = 0$ will lie between $p_1'$ and $p_2'$ at time $t$, such that $p_1' = p_1 + mgt$,
$p_2' = p_2 + mgt$. Prove also the invariance of the element of phase volume, i.e.,
area $\phi_1$ = area $\phi_2$. (Liouville's theorem.)
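
The invariance asserted in this problem can also be checked numerically. The
following Python sketch is an illustration added for that purpose and is not
part of the original problem. It assumes that the coordinate $q$ is measured
downward, so that $dp/dt = +mg$ in agreement with the sign used above; the
function names `flow` and `polygon_area`, the sample cell and the numerical
values are arbitrary choices.

```python
def flow(q0, p0, t, m=1.0, g=9.81):
    """Exact free-fall motion; q is measured downward, so dp/dt = +m*g (assumed convention)."""
    p = p0 + m * g * t
    q = q0 + p0 * t / m + 0.5 * g * t * t
    return q, p


def polygon_area(points):
    """Shoelace formula for the area enclosed by a list of (q, p) vertices."""
    total = 0.0
    n = len(points)
    for k in range(n):
        q1, p1 = points[k]
        q2, p2 = points[(k + 1) % n]
        total += q1 * p2 - q2 * p1
    return abs(total) / 2.0


# A small rectangular cell of phase points at t = 0 (arbitrary illustrative values).
cell_0 = [(0.0, 1.0), (0.5, 1.0), (0.5, 2.0), (0.0, 2.0)]
# The flow is linear in (q0, p0), so the image of the rectangle is again a polygon.
cell_t = [flow(q, p, t=0.7) for q, p in cell_0]

area_0 = polygon_area(cell_0)
area_t = polygon_area(cell_t)
```

The cell is sheared into a parallelogram and displaced, but `area_t` agrees
with `area_0` to rounding error, as Liouville's theorem requires.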



12.7. Further Considerations Regarding the Canonical Ensemble. — 

As an illustration regarding the use of the canonical ensemble we derive
the Maxwell law for the distribution of velocities in an ideal gas. In
accordance with eq. (28) we put

$$P\,d\phi = C e^{-H/\theta}\, dp_1\, dp_2 \cdots dp_n\, dq_1\, dq_2 \cdots dq_n = C e^{-H/\theta} \prod_{i=1}^{\nu} dp_{ix}\, dp_{iy}\, dp_{iz}\, dx_i\, dy_i\, dz_i$$

The constant $C$ must be so chosen as to make $\int P\,d\phi = 1$. For an ideal
gas,

$$H = \sum_{i=1}^{\nu}\left[\frac{p_{ix}^2 + p_{iy}^2 + p_{iz}^2}{2m} + V(x_i, y_i, z_i)\right] = \sum_{i=1}^{\nu} H_i$$

where $V(x,y,z)$ is the potential energy of a particle in an external field if
such a field is present. The probability that particle $i$ have an energy $H_i$
corresponding to $p_{ix}$, $p_{iy}$, $p_{iz}$, $x_i$, $y_i$, $z_i$, regardless of the states of all
other particles, is clearly given by the integral $\int P\,d\phi$ extended over the
momenta and positions of all particles except $i$:

$$P_i\, dp_{ix}\, dp_{iy}\, dp_{iz}\, dx_i\, dy_i\, dz_i = c' e^{-H_i/\theta}\, d\phi_i$$ (12-32)

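$c'$ being some other constant. This relation, often called the
Maxwell-Boltzmann law, is really nothing more than eq. (31). When no external
field $V$ is present, it may be written in more explicit form, for the constant
$c'$ can then be determined. Since

$$\int c' e^{-H_i/\theta}\, d\phi_i = 1$$

we have

$$\frac{1}{c'} = \iiint\!\!\iiint e^{-(p_x^2+p_y^2+p_z^2)/2m\theta}\, dp_x\, dp_y\, dp_z\, dx\, dy\, dz = \tau \iiint e^{-(p_x^2+p_y^2+p_z^2)/2m\theta}\, dp_x\, dp_y\, dp_z$$

where $\tau$ is the volume of the gas. Thus

$$\frac{1}{c'} = \tau\left[\int_{-\infty}^{\infty} e^{-u^2/2m\theta}\, du\right]^3 = \tau\,(2\pi m\theta)^{3/2}$$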


When (32) is now expressed in terms of velocities instead of momenta and
an integration over the volume is carried out on both sides, the result is

$$P(v_x,v_y,v_z)\,dv_x\,dv_y\,dv_z = (2\pi m\theta)^{-3/2}\, e^{-(v_x^2+v_y^2+v_z^2)m/2\theta}\, d(mv_x)\,d(mv_y)\,d(mv_z) = \left(\frac{m}{2\pi\theta}\right)^{3/2} e^{-(v_x^2+v_y^2+v_z^2)m/2\theta}\, dv_x\,dv_y\,dv_z$$ (12-33)

To find the meaning of the parameter $\theta$ we compute the mean energy of
the $i$-th particle, $\bar{H}_i$, which, as we know from simple statistical theory,
must be equal to $\tfrac{3}{2}kT$. We have

$$\bar{H}_i = \left(\frac{m}{2\pi\theta}\right)^{3/2} \iiint \frac{m}{2}\,(u^2+v^2+w^2)\, e^{-(u^2+v^2+w^2)m/2\theta}\, du\, dv\, dw = \frac{3}{2}\,\theta$$

If this is to be equal to $\tfrac{3}{2}kT$, we must put

$$\theta = kT$$ (12-34)


Making this substitution in (33) we obtain Maxwell's law for the
distribution of velocities

$$P(v_x,v_y,v_z) = \left(\frac{m}{2\pi kT}\right)^{3/2} e^{-m(v_x^2+v_y^2+v_z^2)/2kT}$$ (12-35)

The probability that the absolute value of $v$ shall lie between $v$ and $v + dv$
is derived from this expression by transforming the volume element
$dv_x\,dv_y\,dv_z$ to spherical coordinates, where it takes the form
$v^2\,dv \sin\vartheta\, d\vartheta\, d\varphi$, and then integrating over $\vartheta$ and $\varphi$. Thus

$$P(v)\,dv = 4\pi\left(\frac{m}{2\pi kT}\right)^{3/2} v^2 e^{-mv^2/2kT}\, dv$$ (12-36)

According to its derivation $P(v)$ denotes the probability that one
molecule shall have a speed about $v$. It is then clear that $\nu P(v)$ represents
the number of molecules having this speed. It is this last interpretation
which is usually given to Maxwell's law.

For most purposes it is convenient to write the canonical distribution
law, eq. (28), in a slightly different form. When we put $C$, the positive
constant occurring in that equation, equal to $e^{\psi/\theta}$, where $\psi$ is a new
parameter depending on $\theta$, we have

$$P(H) = e^{(\psi - H)/\theta}$$ (12-37)

the standard form used by Gibbs. 

In conclusion, let us attempt to correlate the quantities $H$, $\psi$ and $\theta$
with thermodynamic quantities. This can be done through the
thermodynamic relations, the most important of which are:

$$dU = T\,dS - \sum_i f_i\, d\xi_i$$ (12-38)

and

$$-dA = S\,dT + \sum_i f_i\, d\xi_i$$ (12-39)

Here $U$ stands for the internal energy of the system, $S$ for its entropy, $A$
for the Helmholtz free energy. The force $f_i$ is defined by $f_i = -\partial H/\partial \xi_i$; it
is called into play when the $i$-th external coordinate is changed. The
summation $\sum_i f_i\, d\xi_i$ represents, therefore, the total work done by the
thermodynamic system when it undergoes a (reversible) change involving
variation of the $\xi_i$.

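We now return to the ensemble whose distribution is given by (37).
This distribution will change in detail as the condition of the system
changes. But it will change in such a way that

$$\int e^{(\psi - H)/\theta}\, d\phi = 1$$

On differentiating this relation (permitting the external parameters $\xi_i$ as
well as $\theta$ and hence $\psi$ to be altered) we have

$$\int e^{(\psi-H)/\theta}\left[\frac{d\psi}{\theta} - \frac{\psi - H}{\theta^2}\,d\theta - \frac{1}{\theta}\sum_i \frac{\partial H}{\partial \xi_i}\,d\xi_i\right] d\phi = \frac{d\psi}{\theta} - \frac{\overline{\log P}}{\theta}\,d\theta + \frac{1}{\theta}\sum_i \bar{f}_i\, d\xi_i = 0$$ (12-40)

provided we use eq. (37) and indicate averages over the ensemble by
horizontal bars, i.e., $\bar{Q} = \int PQ\,d\phi$. The last equation may be written

$$d\psi = \overline{\log P}\; d\theta - \sum_i \bar{f}_i\, d\xi_i$$ (12-41)

But since $\log P = (\psi - H)/\theta$, we also have

$$\psi = \theta\,\overline{\log P} + \bar{H}$$ (12-42)

$$d\psi = d\bar{H} + \theta\, d(\overline{\log P}) + \overline{\log P}\; d\theta$$

When this is substituted in (41), the result is

$$d\bar{H} = -\theta\, d(\overline{\log P}) - \sum_i \bar{f}_i\, d\xi_i$$ (12-43)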

i 

Now it is clear that $d\bar{H}$ must be identified with the increase in total energy
of the thermodynamic system, $dU$. We have already established the
relation $\theta = kT$. Furthermore, the $\bar{f}_i$ can hardly be anything other than
the actual forces acting on the real system. We then see that (43) is the
exact analogue of the thermodynamic relation (38), provided we interpret
$-\overline{\log P}$ as entropy divided by Boltzmann's constant, $k$.

When eq. (41) is now compared with (39), $\psi$ is at once seen to be the
Helmholtz free energy, $A$. With this additional interpretation, eq. (42)
becomes the familiar

$$A = U - TS$$

Further pursuit of this matter shows that all thermodynamic relations are
satisfied if we correlate thermodynamic with statistical quantities as
follows:

Total energy ($U$)            corresponds to   $\bar{H}$
Absolute temp. ($T$)          corresponds to   $\theta/k$
Entropy ($S$)                 corresponds to   $-k\,\overline{\log P}$
Helmholtz free energy ($A$)   corresponds to   $\psi$
                                                          (12-44)


When $A$ is given, many thermodynamic properties of the system at hand
are known (cf. Chapter 1). It is therefore important to know how to
compute the free energy, i.e., $\psi$, statistically. Since $\int e^{(\psi-H)/\theta}\,d\phi = 1$, we
have $e^{-\psi/\theta} = \int e^{-H/\theta}\,d\phi$. The integral $\int e^{-H/\theta}\,d\phi = I$, which is thus
seen to be basic in the evaluation of $\psi$, is often called the phase integral
(also sum of state and partition function). In terms of it

$$\psi = -kT \log I$$
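
As a concrete instance of the phase integral, one may take a one-dimensional
harmonic oscillator, $H = p^2/2m + m\omega^2 q^2/2$, for which the integral can be
done in closed form, $I = 2\pi\theta/\omega$. The Python sketch below is an added
illustration under that assumption; the oscillator itself, the grid parameters
and the function name `phase_integral_stats` are not taken from the text. It
evaluates $I$, $\bar{H}$ and $\overline{\log P}$ by a midpoint rule so that the correlations
(12-42) and (12-44) can be checked numerically.

```python
import math


def phase_integral_stats(theta, m=1.0, omega=1.0, half_width=8.0, n=200):
    """Midpoint-rule estimates of I, psi, mean H and mean log P for a 1-D oscillator.

    theta plays the role of kT; the grid reaches half_width thermal widths in q and p.
    """
    q_max = half_width * math.sqrt(theta / (m * omega ** 2))
    p_max = half_width * math.sqrt(m * theta)
    dq, dp = 2.0 * q_max / n, 2.0 * p_max / n
    energies = []
    for a in range(n):
        q = -q_max + (a + 0.5) * dq
        for b in range(n):
            p = -p_max + (b + 0.5) * dp
            energies.append(p * p / (2.0 * m) + 0.5 * m * omega ** 2 * q * q)
    cell = dq * dp
    I = sum(math.exp(-h / theta) for h in energies) * cell      # phase integral
    psi = -theta * math.log(I)                                  # psi = -kT log I
    # Canonical density of eq. (12-37): P = exp((psi - H) / theta)
    mean_H = sum(h * math.exp((psi - h) / theta) for h in energies) * cell
    mean_logP = sum((psi - h) / theta * math.exp((psi - h) / theta)
                    for h in energies) * cell
    return I, psi, mean_H, mean_logP


I, psi, mean_H, mean_logP = phase_integral_stats(theta=2.0, m=1.0, omega=3.0)
```

For this example the mean energy comes out equal to $\theta$, and $\psi$ agrees with
$\bar{H} + \theta\,\overline{\log P}$, i.e., with $A = U - TS$.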


Problem. Using (36), verify the following relations for the first and second
moments of the velocity distribution of an ideal gas:

$$\bar{v} = \frac{2}{\sqrt{\pi}}\left(\frac{2kT}{m}\right)^{1/2}, \qquad \overline{v^2} = \frac{3kT}{m}$$

Show also that the most probable velocity is $(2kT/m)^{1/2}$.

12.8. The Method of Darwin and Fowler. — A statistical method different
from that of Gibbs but also leading to the correct thermodynamic
laws, and more adaptable to the needs of quantum mechanics, has been
introduced by Darwin and Fowler (Phil. Mag. 44, 450, 823 (1922); 46, 1, 497
(1923); for a general and more recent treatment see Fowler, R. H.,
"Statistical Mechanics," Second Edition, Cambridge University Press, 1936).
We shall first describe its fundamental features and then use it to derive
quantum mechanical distribution laws. Consider a system made up of $\nu$
similar particles. No reference to an ensemble will here be made; all
arguments concern this single, real system. If the particles are independent,
as will now be assumed, each individual particle may be said to be in a
definite energy state $\epsilon_i$, this $\epsilon_i$ being an eigenvalue of the Schrödinger
equation (see Chapter 11) for the single particle, with boundary conditions
corresponding to the volume of the total system if the latter is a fluid, or
other suitable conditions if it is a crystal. By the state of the total system
we mean the aggregate of single particle states, that is the assignment of
individual particles to the various energies $\epsilon_i$. But a microscopic
assignment of particles to energies, as in the statement: particle 1 has
energy $\epsilon_i$, particle 2 has energy $\epsilon_j$, particle 3 has energy $\epsilon_k$, etc., has no
meaning in quantum mechanics in view of the exclusion principle. The best
that can be done in specifying a state is therefore to say that $a_1$ particles
have energy $\epsilon_1$, $a_2$ particles have energy $\epsilon_2$, $\cdots$ $a_s$ particles have
energy $\epsilon_s$, etc. Thus a state is defined when a system of "occupation
numbers," $a_1$, $a_2$, $\cdots$ $a_s$ is given. These must obviously satisfy the relation

$$\sum_i a_i = \nu$$ (12-45)

We shall also prescribe that the system shall have a fixed total energy $E$,
so that

$$\sum_i a_i \epsilon_i = E$$ (12-46)


Now it is possible, as will be shown in the next section, to assign a
statistical weight, $w(a_1 \cdots a_s)$, to each state $a_1 \cdots a_s$. The average of a
quantity $Q(a_1 \cdots a_s)$ which takes on different values for different states is
then defined by

$$\bar{Q} = \frac{\sum w(a_1 \cdots a_s)\, Q(a_1 \cdots a_s)}{\sum w(a_1 \cdots a_s)}$$ (12-47)


The summations in this expression are understood to be taken over all
values of $a_1$, $a_2$, $a_3$, etc., which satisfy conditions (45) and (46); the index $s$
is in general very large; it is given by $\epsilon_s \le E$, $\epsilon_{s+1} > E$. Contact with
experience is made in the Darwin-Fowler theory by assuming that $\bar{Q}$ is the
observed value of the quantity $Q$ when a measurement is made on the
system. The Gibbsian ensemble average is here replaced by an average
over the states of a single thermodynamic system. The fact that they
agree is rather noteworthy from a logical point of view. To carry through
the calculation of an average like (47) it is necessary (a) to construct a
suitable weighting function, $w$; (b) to devise means for evaluating the
restricted sums appearing in that equation.

12.9. Quantum Mechanical Distribution Laws. — In quantum mechan- 
ics, the weight of an energy state is defined as its degree of degeneracy: it is 
equal to the number of linearly independent state functions belonging to 
the eigenstate in question. This postulate will here be invoked. Our
system, however, is one containing $\nu$ similar particles; hence it is necessary
to apply all the considerations of secs. 11.32 and 11.33, in particular the
exacting demands of the exclusion principle. But for the moment it seems
well to consider the number of eigenstates of $E$ belonging to the statistical
state $(a_1 \cdots a_s)$ when the exclusion principle is left out of account.





As an example, consider the simple case of 5 particles, with energy
partition $a_1 = 2$ and $a_2 = 3$. Let the single-particle function belonging
to the single-particle energy $\epsilon_1$ be $\psi_1$, that belonging to $\epsilon_2$, $\psi_2$. The
Schrödinger equation for the 5 particles corresponding to $E = 2\epsilon_1 + 3\epsilon_2$ will
then be satisfied by the simple product

$$\psi_1(1)\,\psi_1(2)\,\psi_2(3)\,\psi_2(4)\,\psi_2(5)$$ (12-48)

as well as by any function obtained from this product through permutation
of the arguments 1 to 5 (each numeral designates all coordinates, including
the spin, of the corresponding particle). But not all the 5! products thus
obtained are independent. For instance a transposition of 1 and 2, or a
permutation among particles 3, 4 and 5 causes no change in the function.
The number of different combinations is obviously equal to the number of
ways in which 5 objects can be arranged in 2 piles, one containing 2, the
other 3 objects. This, according to eq. (1), is

$$\frac{5!}{2!\,3!}$$


The generalization of this result is immediate; the number of different
energy eigenfunctions belonging to $(a_1 a_2 \cdots a_s)$ is given by

$$w(a_1 \cdots a_s) = \frac{\nu!}{a_1!\,a_2! \cdots a_s!}$$ (12-49)


This is true so long as each individual particle function, $\psi_i$, is
non-degenerate. Suppose now that the energy $\epsilon_i$ itself can be realized by $g_i$
different functions. Each $\psi_i$ then has a weight $g_i$, and the product of $a_i$
such functions has a weight $g_i^{a_i}$. We then obtain, in place of (49), the more
general result

$$w(a_1 \cdots a_s) = \nu!\, \frac{g_1^{a_1}\, g_2^{a_2} \cdots g_s^{a_s}}{a_1!\,a_2! \cdots a_s!}$$ (12-50)


To see this in detail, let us return to the example (48) and assume $g_1 = 3$,
$g_2 = 2$. It is then necessary to introduce new functions, e.g., $b$, $c$, $d$ in place
of $\psi_1$; $e$ and $f$ in place of $\psi_2$. Instead of $\psi_1(1)\psi_1(2)$ we can now have

b(1)b(2), c(1)c(2), d(1)d(2), b(1)c(2), b(2)c(1), b(1)d(2),
b(2)d(1), c(1)d(2), c(2)d(1)

and in place of $\psi_2(3)\psi_2(4)\psi_2(5)$:

e(3)e(4)e(5), f(3)f(4)f(5), e(3)f(4)f(5), e(4)f(3)f(5), e(5)f(3)f(4),
e(3)e(4)f(5), e(3)e(5)f(4), e(4)e(5)f(3)

Eq. (50) would be the statistical weight of the state $a_1 \cdots a_s$ if no symmetry
requirements, no Pauli principle, had to be respected.

How many different functions can be constructed out of the individual
$\psi_i$ when overall antisymmetry is demanded? We have seen that the only
antisymmetric function available has the determinantal form eq. (11-160').
This, however, vanishes when any two particles are described by the same
$\psi_i$, for then two rows of the determinant are equal. Hence, if there is no
degeneracy in the individual particle functions (all $g_i$ are 1), only one
function is constructible; in other words,

$$w(a_1 \cdots a_s) = \begin{cases} 1 & \text{if every } a_i \le 1 \\ 0 & \text{if any } a_i > 1 \end{cases}$$


On the other hand, if the $i$-th state has a degeneracy $g_i > 1$, the number
of non-vanishing determinants which can be constructed is equal to the
number of ways in which $a_i$ arguments can be distributed among $g_i$
different functions, and this, by virtue of eq. (2), is $\binom{g_i}{a_i}$. Thus we obtain in
general, when the exclusion principle is applied,

$$w(a_1 \cdots a_s) = \binom{g_1}{a_1}\binom{g_2}{a_2} \cdots \binom{g_s}{a_s}$$ (12-51)

Note that this vanishes when any $a_i$ is greater than its corresponding $g_i$, so
that the preceding equation is a special case of this.

Finally, we consider the case in which the total function is symmetrical.
As was shown in sec. 11.33, this is of the form $\sum_P P\psi$, where $\psi$ is a function
constructed like (48), with a particular permutation of the arguments
1 to $\nu$. But if any permutation of arguments is made in $\sum_P P\psi$, this
function is transformed into itself. Hence $w(a_1 \cdots a_s)$ is always 1 provided all
$g_i$ are 1. But if this is not true, then the degeneracy of $\psi_i$ gives rise to as
many different combinations of functions as there are ways
of distributing the $a_i$ arguments amongst them, without regard to the
number of arguments associated with the same function. This number,
by eq. (5), is $\binom{g_i + a_i - 1}{a_i}$. We have thus determined $w$ for the
symmetrical case to be

$$w(a_1 \cdots a_s) = \prod_i \binom{g_i + a_i - 1}{a_i}$$ (12-52)

Assemblies of particles whose motion is governed by the Pauli principle
must be described by antisymmetric functions. Their statistical weights
are given by (51). The formulas which ensue from its use are characteristic
of Fermi-Dirac statistics, the type of statistics to which electrons,
neutrons, protons are subject. Henceforth we refer to (51) as $w_{F.D.}$. On
the other hand, photons, nuclei and atoms containing an even number of
elementary particles (e.g., He$^4$) are known to require symmetrical state
functions for their collective description. Their statistical states have
weights given by (52); they are said to obey Einstein-Bose statistics.
Henceforth we write $w_{E.B.}$ in place of (52). No known constituents of
material bodies are described by (50), although it is precisely that
assignment of weights which leads to the Maxwell-Boltzmann law. It will be
shown, however, that both quantum formulations, (51) and (52), give rise
to distribution laws which under many thermodynamic circumstances are
practically identical with the Maxwell-Boltzmann law. For this reason,
and for the sake of generality, we continue to include eq. (50) in our
consideration, and refer to it as $w_c$ (classical assignment of statistical weights).

It is to be noted that all three statistical weights may be written in the
form

$$w = \prod_j \gamma(a_j)$$

if we put

$$\gamma_c(a_j) = \frac{g_j^{a_j}}{a_j!}, \qquad \gamma_{F.D.}(a_j) = \binom{g_j}{a_j}, \qquad \gamma_{E.B.}(a_j) = \binom{g_j + a_j - 1}{a_j}$$ (12-53)

In proceeding thus the factor $\nu!$ in eq. (50) is being omitted. However,
since this factor is independent of the $a$'s and hence constant for all
statistical states, it will cancel when averages are computed after the manner of
eq. (47). This is the principal use which will here be made of the weight
function. In many other problems the omission of $\nu!$ is not permissible.
(See the remarks after eq. 72.)
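
The three assignments of weight are easily tabulated for small cases. The
following Python sketch is an added illustration; the function names are
arbitrary, and the sample values are those of the worked example above
($a_1 = 2$, $a_2 = 3$, $g_1 = 3$, $g_2 = 2$). The classical weight is computed with
the factor $\nu!$ included, as in eq. (12-50).

```python
from math import comb, factorial, prod


def w_classical(a, g):
    """Eq. (12-50): nu! * prod(g_i**a_i) / prod(a_i!), evaluated in exact integers."""
    multinomial = factorial(sum(a)) // prod(factorial(n) for n in a)
    return multinomial * prod(gi ** n for n, gi in zip(a, g))


def w_fermi_dirac(a, g):
    """Eq. (12-51): product of binomial coefficients C(g_i, a_i); zero if any a_i > g_i."""
    return prod(comb(gi, n) for n, gi in zip(a, g))


def w_einstein_bose(a, g):
    """Eq. (12-52): product of C(g_i + a_i - 1, a_i)."""
    return prod(comb(gi + n - 1, n) for n, gi in zip(a, g))


a = (2, 3)   # occupation numbers of example (12-48)
g = (3, 2)   # degeneracies assumed in the worked example
```

For these values the classical weight is $10 \cdot 3^2 \cdot 2^3 = 720$, the
Fermi-Dirac weight vanishes because $a_2 > g_2$, and the Einstein-Bose weight
is $6 \cdot 4 = 24$.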

The quantity $Q$ whose average we wish to calculate is $a_r$, the number of
particles having a given energy $\epsilon_r$. It is necessary, therefore, to evaluate

$$\bar{a}_r = \frac{\sum_{(a)} a_r \prod_j \gamma(a_j)}{\sum_{(a)} \prod_j \gamma(a_j)} = \frac{A}{W}$$ (12-54)

$W$ is here written for the sum of all statistical weights compatible with
the fixed energy $E$, $A$ for $\bar{a}_r W$. The summations appearing in (54) are
taken over all values of $a_1$, $a_2$, $\cdots$ which satisfy both (45) and (46).

The formula for $w_c$ can also be derived as follows, without reference to state
functions. Divide phase space into cells according to the energies of the individual
particles: in the $i$-th cell a particle has energy $\epsilon_i$. If the $i$-th cell has a fundamental
weight $g_i$, then the number of ways in which the state $a_1, a_2 \cdots a_s$ can be realized by
assignments of specific particles to cells is given by eq. (50).

We first calculate $W$.

For this purpose consider the expansion

$$M = \sum_{a_1 a_2 \cdots a_s} \Big[\prod_j \gamma(a_j)\Big]\, x^{\sum_j a_j}\, z^{\sum_j a_j \epsilon_j}$$ (12-55)

in which the summation over the $a$'s is entirely unrestricted, each one of
the many $a$'s taking all values from 0 to $\infty$. $M$ may be regarded as a
function of $x$ and $z$, depending parametrically upon the eigenvalues $\epsilon_j$
characteristic of the particles in question. A moment's reflection will show that
$W$ is the coefficient of $x^{\nu} z^{E}$ in the expansion $M$; in other words, because of
the theorem of residues eq. (3-3),

$$W = \left(\frac{1}{2\pi i}\right)^2 \oint\!\!\oint \frac{M\, dx\, dz}{x^{\nu+1} z^{E+1}}$$ (12-56)

the integrals being taken counter-clockwise about the poles of the
integrand, i.e., about $x = 0$ and $z = 0$, $x$ and $z$ being considered as complex
variables.

Now $M$ may be evaluated rather simply. First note that it can be
written

$$M = \sum_{a_1 \cdots a_s} \prod_j \gamma(a_j)\, x^{a_j} z^{a_j \epsilon_j} = \prod_j \Big\{ \sum_{n=0}^{\infty} \gamma(n)\, x^n z^{n\epsilon_j} \Big\}$$

The summation in { } can be performed for all three of the functions $\gamma$
listed in (53). Let us put $x z^{\epsilon_j} = r$, $\sum_{n=0}^{\infty} \gamma(n)\, r^n = f(r)$. We then obtain

$$f_c = \sum_{n=0}^{\infty} \frac{g_j^n}{n!}\, r^n = \exp(g_j r) = \exp\{g_j x z^{\epsilon_j}\}$$

$$f_{F.D.} = \sum_{n=0}^{\infty} \binom{g_j}{n} r^n = (1 + r)^{g_j} = (1 + x z^{\epsilon_j})^{g_j}$$

$$f_{E.B.} = \sum_{n=0}^{\infty} \binom{g_j + n - 1}{n} r^n = (1 - r)^{-g_j} = (1 - x z^{\epsilon_j})^{-g_j}$$ (12-57)

The last result, which is perhaps not so obvious, is easily verified by writing
down the MacLaurin expansion of

$$(1 - r)^{-g} = 1 + g r + \frac{g(g+1)}{2!}\, r^2 + \frac{g(g+1)(g+2)}{3!}\, r^3 + \cdots$$

which is identical with the summation in $f_{E.B.}$.

Thus we see that

$$M = \prod_j f(x z^{\epsilon_j})$$ (12-58)

and $W$ is related to $M$ by eq. (56). The effect of the degeneracy factors $g_j$
in (57) is rather interesting: they merely appear as exponents in the
$f$-functions in all three cases.
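
The statement that $W$ is the coefficient of $x^{\nu} z^{E}$ in $M$ may be tested on a
system small enough for the restricted sums of (54) to be carried out by
direct enumeration. The Python sketch below is an added illustration and not
the authors' procedure: it assumes non-negative integer energies so that $M$
can be handled as a polynomial, and the toy system, the function names and
the use of exact fractions are arbitrary choices.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial


# gamma(n, g) factors of eq. (12-53); Fractions keep the arithmetic exact.
def gamma_c(n, g):
    return Fraction(g ** n, factorial(n))


def gamma_fd(n, g):
    return Fraction(comb(g, n))


def gamma_eb(n, g):
    return Fraction(comb(g + n - 1, n))


def enumerate_W_and_means(eps, g, nu, E, gamma):
    """Restricted sums of eq. (12-54): returns W and the mean occupations A/W.

    Assumes at least one admissible state with non-zero weight (otherwise W = 0).
    """
    W = Fraction(0)
    A = [Fraction(0)] * len(eps)
    for a in product(range(nu + 1), repeat=len(eps)):
        if sum(a) != nu or sum(n * e for n, e in zip(a, eps)) != E:
            continue                       # violates (12-45) or (12-46)
        w = Fraction(1)
        for n, gj in zip(a, g):
            w *= gamma(n, gj)
        W += w
        A = [Ar + n * w for Ar, n in zip(A, a)]
    return W, [Ar / W for Ar in A]


def coefficient_W(eps, g, nu, E, gamma):
    """Coefficient of x**nu * z**E in M = prod_j sum_n gamma(n, g_j) x**n z**(n*eps_j)."""
    poly = {(0, 0): Fraction(1)}           # (power of x, power of z) -> coefficient
    for e, gj in zip(eps, g):
        new = {}
        for (px, pz), c in poly.items():
            for n in range(nu - px + 1):   # powers of x above nu cannot contribute
                if pz + n * e > E:
                    break                  # valid because the energies are non-negative
                key = (px + n, pz + n * e)
                new[key] = new.get(key, Fraction(0)) + c * gamma(n, gj)
        poly = new
    return poly.get((nu, E), Fraction(0))


eps, g, nu, E = (0, 1, 2), (2, 2, 2), 3, 3   # illustrative toy system
W_fd, means_fd = enumerate_W_and_means(eps, g, nu, E, gamma_fd)
W_eb, means_eb = enumerate_W_and_means(eps, g, nu, E, gamma_eb)
W_c, means_c = enumerate_W_and_means(eps, g, nu, E, gamma_c)
```

In each of the three statistics the enumerated $W$ coincides with the
coefficient extracted from $M$, and the mean occupation numbers add up to $\nu$.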

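Let us now consider the numerator of eq. (54). If we calculate the
quantity $(1/\ln z)(\partial M/\partial \epsilon_r)$, we find, using eq. (55),

$$\frac{1}{\ln z}\frac{\partial M}{\partial \epsilon_r} = \sum_{a_1 a_2 \cdots a_s} a_r \Big[\prod_j \gamma(a_j)\Big]\, x^{\sum_j a_j}\, z^{\sum_j a_j \epsilon_j}$$

On comparing this with (54) we see that $A$ is the coefficient of $x^{\nu} z^{E}$ in the
expansion of $(1/\ln z)(\partial M/\partial \epsilon_r)$. But this last quantity can be put in a
form more suitable for our purposes. In view of (58),

$$\frac{1}{\ln z}\frac{\partial M}{\partial \epsilon_r} = \frac{1}{\ln z}\, \frac{M}{f(x z^{\epsilon_r})}\, \frac{d f(x z^{\epsilon_r})}{d(x z^{\epsilon_r})}\, x z^{\epsilon_r} \ln z = M x \frac{\partial}{\partial x} \ln f(x z^{\epsilon_r})$$

Summarizing these steps, we note:

$$A = \left(\frac{1}{2\pi i}\right)^2 \oint\!\!\oint \Big[x \frac{\partial}{\partial x} \ln f(x z^{\epsilon_r})\Big] \frac{M\, dx\, dz}{x^{\nu+1} z^{E+1}}$$ (12-59)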

If now it were possible to find a path of integration around the origin
with respect to both $x$ and $z$, such that the function $M x^{-\nu-1} z^{-E-1}$ were
practically zero everywhere along that path except in the immediate
neighborhood of two definite points, say $x = \xi$ and $z = \vartheta$, the evaluation
of $\bar{a}_r = A/W$ would be very simple. For it would then be permissible to
take the factor $x(\partial/\partial x) \ln f$, multiplying the integrand in (59), in front
of the integral sign and give to $x$ and $z$ the values $\xi$ and $\vartheta$, and the integrals
themselves would cancel. We should then have

$$\bar{a}_r = \Big[x \frac{\partial}{\partial x} \ln f(x z^{\epsilon_r})\Big]_{x=\xi,\; z=\vartheta}$$

This procedure is indeed justified, as the following section will show. If
the $f$-functions are identified in accordance with (57), the result is seen to be

$$(\bar{a}_r)_c = g_r\, \xi \vartheta^{\epsilon_r}, \qquad (\bar{a}_r)_{F.D.} = \frac{g_r}{\frac{1}{\xi}\vartheta^{-\epsilon_r} + 1}, \qquad (\bar{a}_r)_{E.B.} = \frac{g_r}{\frac{1}{\xi}\vartheta^{-\epsilon_r} - 1}$$

The physical significance of the parameters $\xi$ and $\vartheta$ can be fixed most simply
by the subsequent plausibility argument which we offer in lieu of more
detailed considerations (cf. Fowler, loc. cit.) adducible to establish this
meaning more completely. The first of the expressions above must be the
Maxwell-Boltzmann law which reads, for the situation here considered,
$\bar{a}_r = A_0\, g_r\, e^{-\epsilon_r/kT}$, the factor $A_0$ being so determined that
$\sum_r \bar{a}_r = \nu$. Hence $\xi$ must be identified with $A_0$, and $\vartheta$ with $e^{-1/kT}$.
In the other two relations $\xi$ must also act as a normalizing factor, while
$\vartheta$ has the same meaning as in $(\bar{a}_r)_c$. Hence we conclude

$$(\bar{a}_r)_c = \xi\, g_r\, e^{-\epsilon_r/kT} \qquad \text{(a)}$$

$$(\bar{a}_r)_{F.D.} = \frac{g_r}{\frac{1}{\xi} e^{\epsilon_r/kT} + 1} \qquad \text{(b)}$$ (12-60)

$$(\bar{a}_r)_{E.B.} = \frac{g_r}{\frac{1}{\xi} e^{\epsilon_r/kT} - 1} \qquad \text{(c)}$$

It is easily seen in a qualitative way that $\xi$ must increase when $\nu$
increases (the volume of the system being fixed) if $\sum_r \bar{a}_r$ is to remain equal
to $\nu$. In (a), $\xi$ is in fact proportional to $\nu$, but this simple dependence
fails in (b) and (c). Nevertheless, if $\nu$ is very small, $\frac{1}{\xi} e^{\epsilon_r/kT} \gg 1$.
In this case both (b) and (c) reduce to the classical form (a). Hence for
sufficiently small densities all assemblies show an essentially classical
behavior. Closer investigation (see any of the references at the beginning
of this chapter) indicates that this is true for all ordinary molecules at
ordinary temperatures, thus justifying the use of classical statistics.
The main instances in which quantum distribution laws are needed are the
motion of electrons in metals (b), the photon gas, and helium at very low
temperatures (c).
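
The reduction of (b) and (c) to the classical form (a) for small $\xi$ can be
seen at once from a few numbers. The Python sketch below is an added
illustration; the function name `occupation`, the labels for the three kinds
of statistics and the sample values are assumptions made for the example.

```python
import math


def occupation(eps, g, xi, kT, kind):
    """Mean occupation number of eq. (12-60) for one level of energy eps, weight g."""
    boltz = xi * math.exp(-eps / kT)       # xi * theta**eps with theta = exp(-1/kT)
    if kind == "classical":
        return g * boltz                   # form (a)
    if kind == "FD":
        return g / (1.0 / boltz + 1.0)     # form (b)
    if kind == "EB":
        return g / (1.0 / boltz - 1.0)     # form (c); requires boltz < 1
    raise ValueError(f"unknown statistics: {kind}")


# Dilute case: xi very small, so the three laws should practically coincide.
dilute = {kind: occupation(1.0, 2.0, 1e-6, 1.0, kind) for kind in ("classical", "FD", "EB")}
# Dense case: xi of order unity, where the three laws differ markedly.
dense = {kind: occupation(0.0, 1.0, 0.5, 1.0, kind) for kind in ("classical", "FD", "EB")}
```

With $\xi = 10^{-6}$ the three occupation numbers agree to better than one part
in a million, whereas with $\xi = 1/2$ and $\epsilon_r = 0$ they are $1/2$, $1/3$ and $1$.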

All thermodynamic relations can be deduced by the method here
described provided the following associations between thermodynamic
quantities and elements appearing in the Darwin-Fowler scheme are
made; the first is obvious, the second has already been obtained; the
others will be derived later on (cf. eqs. 71 and 72):

$U$ corresponds to $E$
$T$ corresponds to $-(k \ln \vartheta)^{-1}$
$S$ corresponds to $k \ln W$
$A$ corresponds to $-kT\big(\sum_j \ln f(\xi\vartheta^{\epsilon_j}) - \nu \ln \xi\big)$
                                                          (12-61)

12.10. The Method of Steepest Descents. — The evaluation of $\bar{a}_r$ in the
last section depended for its validity on our ability to find a point $x = \xi$ in
the integration with respect to $x$ and another point $z = \vartheta$ in the integration
with respect to $z$, at which the integrand of (56) would be large, and
where its value would descend very steeply on both sides along the path
of integration. Such a point has interesting properties which we shall
first investigate in connection with a simpler but more general example.
(Cf. Fowler, loc. cit.)

Let $\varphi(z)$ be a function in the complex plane, $z$ being $x + iy$. We wish
to find a point in the $X$, $Y$-plane such that, as we cross that point in the
direction of steepest descent of $\varphi$, $\varphi$ will be a maximum at the point. To
be more specific we will refer in this inquiry not to $\varphi$, but to the real part
of $\varphi$. Suppose the point thus defined is $z = \vartheta$.

On writing

$$\varphi(z) = g(x,y) + i h(x,y)$$

where $g$ and $h$ are real functions, it is clear that these must satisfy the
Riemann relations:

$$g_x = h_y, \qquad g_y = -h_x$$ (12-63)

Our specification amounts to this: $dg = g_x\,dx + g_y\,dy$ shall be zero along
the path on which $g$ decreases most rapidly. The direction of this path is
the direction of the negative gradient of $g$, namely $-\nabla g$. Since this is
$-(\mathbf{i}g_x + \mathbf{j}g_y)$, this direction is defined by $dy/dx = g_y/g_x$. But by reason
of eq. (63), this is $-h_x/h_y$. On the other hand, if $dy/dx = -h_x/h_y$, then

$$h_x\,dx + h_y\,dy = dh = 0$$

in the same direction. Now the vanishing of both $dg$ and $dh$ at $\vartheta$ is possible
only if $\varphi'(\vartheta) = 0$. We may conclude, therefore, that the point in question,
if it exists at all, satisfies the condition

$$\varphi' = 0$$

If $\varphi'$ has a real root, the point $\vartheta$ will obviously lie on the real axis.

Next, it will be shown that $\vartheta$ is a "saddle point," i.e., that the curvature
on the path of steepest descent is opposite to that along a path at right
angles to this direction. For any direction $dy/dx$

$$d^2 g = g_{xx}\,dx^2 + 2 g_{xy}\,dx\,dy + g_{yy}\,dy^2$$

The direction at right angles to $dy/dx$ is fixed by the substitution of $-dy$
for $dx$, and of $dx$ for $dy$. Hence, on writing $d^2 g_{\perp}$ for the curvature at right
angles to the direction $dy/dx$,

$$d^2 g_{\perp} = g_{xx}\,dy^2 - 2 g_{xy}\,dx\,dy + g_{yy}\,dx^2$$

so that

$$d^2 g + d^2 g_{\perp} = (g_{xx} + g_{yy})(dx^2 + dy^2) = 0$$

in view of eqs. (63).

By $g_x$ is meant $\partial g/\partial x$, etc. To prove these well known relations we observe

$$\varphi_x = \varphi' = g_x + i h_x, \qquad \varphi_y = i\varphi' = g_y + i h_y$$

with $\varphi' = d\varphi/dz$. Hence $\varphi' = g_x + i h_x = -i g_y + h_y$, which is equivalent to eqs. (63)
when real and imaginary parts are equated separately. Note also that (63) implies:
$g_{xx} + g_{yy} = 0$.
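
The saddle-point property may be illustrated with a simple analytic
function. The Python sketch below is an added example, not taken from the
text: it assumes $\varphi(z) = z - n \ln z$ with $n = 5$, whose derivative
$1 - n/z$ vanishes at the real point $z = n$, and it estimates the curvature of
the real part along the real and imaginary directions by central differences.
The function names and the step size are arbitrary.

```python
import cmath


def phi(z, n=5.0):
    """Illustrative analytic function; phi'(z) = 1 - n/z vanishes at z = n."""
    return z - n * cmath.log(z)


def second_difference(f, z0, direction, h=1e-3):
    """Central second difference of Re f along a unit complex direction."""
    return (f(z0 + h * direction).real - 2.0 * f(z0).real
            + f(z0 - h * direction).real) / h ** 2


z0 = 5.0 + 0.0j                                               # root of phi'
curvature_real = second_difference(phi, z0, 1.0 + 0.0j)       # along the real axis
curvature_imag = second_difference(phi, z0, 0.0 + 1.0j)       # at right angles to it
```

The two curvatures come out equal and opposite (about $+0.2$ and $-0.2$), so
that the real part has a minimum along the real axis and a maximum along the
path crossing it at right angles.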

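With this general knowledge, let us return to the calculation of (56):

$$W = \left(\frac{1}{2\pi i}\right)^2 \oint\!\!\oint M x^{-\nu} z^{-E}\, \frac{dx}{x}\frac{dz}{z}, \qquad M = \prod_j f(r_j), \quad r_j = x z^{\epsilon_j}$$ (12-64)

Let us further put

$$M x^{-\nu} z^{-E} = e^{Y}$$

so that

$$Y = \sum_j \ln f(r_j) - \nu \ln x - E \ln z$$ (12-65)

A saddle point of the integrand of (64) is then determined by

$$\frac{\partial Y}{\partial x} = 0$$

in the integration with respect to $x$, and by

$$\frac{\partial Y}{\partial z} = 0$$

in the integration with respect to $z$. The first of these leads to

$$\sum_j \frac{f'(\rho_j)}{f(\rho_j)}\, \rho_j = \nu$$ (12-66)

the second to

$$\sum_j \frac{f'(\rho_j)}{f(\rho_j)}\, \epsilon_j \rho_j = E$$ (12-67)

where $\rho_j$ has been written for $\xi\vartheta^{\epsilon_j}$. Eqs. (66) and (67) define the saddle
point $(\xi, \vartheta)$ in $X$-$Z$ space.

These relations at once take on a more interesting form when we insert
the value of $f'/f$ in accordance with (57), for they become

$$\sum_j g_j\, \xi\vartheta^{\epsilon_j} = \nu, \qquad \sum_j g_j \epsilon_j\, \xi\vartheta^{\epsilon_j} = E$$

in classical statistics;

$$\sum_j \frac{g_j}{\frac{1}{\xi}\vartheta^{-\epsilon_j} + 1} = \nu, \qquad \sum_j \frac{g_j \epsilon_j}{\frac{1}{\xi}\vartheta^{-\epsilon_j} + 1} = E$$

in F.D. statistics; and

$$\sum_j \frac{g_j}{\frac{1}{\xi}\vartheta^{-\epsilon_j} - 1} = \nu, \qquad \sum_j \frac{g_j \epsilon_j}{\frac{1}{\xi}\vartheta^{-\epsilon_j} - 1} = E$$

in E.B. statistics. All these simply read

$$\sum_j \bar{a}_j = \nu, \qquad \sum_j \bar{a}_j \epsilon_j = E$$

if the result (60) is used.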
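
In practice the first of these conditions serves to fix $\xi$ when the
temperature, and hence $\vartheta$, is given; the second then yields $E$. The Python
sketch below is an added illustration for F.D. statistics: it finds $\xi$ by
bisection, which is one possible numerical choice and not a method described
in the text, and the function names, level scheme and tolerances are
assumptions.

```python
import math


def total_particles_fd(xi, eps, g, kT):
    """Sum over levels of the mean F.D. occupation numbers for a trial value of xi."""
    return sum(gj / (math.exp(e / kT) / xi + 1.0) for e, gj in zip(eps, g))


def solve_xi(nu, eps, g, kT, rel_tol=1e-12):
    """Bisection for xi; the sum rises monotonically from 0 to sum(g) as xi grows."""
    if not 0 < nu < sum(g):
        raise ValueError("need 0 < nu < total number of single-particle states")
    lo, hi = 0.0, 1.0
    while total_particles_fd(hi, eps, g, kT) < nu:
        hi *= 2.0                          # expand until the root is bracketed
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if total_particles_fd(mid, eps, g, kT) < nu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


eps, g, kT, nu = [0.0, 1.0, 2.0, 3.0], [2, 2, 2, 2], 1.0, 3   # illustrative values
xi = solve_xi(nu, eps, g, kT)
# Energy that belongs to this xi and temperature, from the second condition.
E = sum(e * gj / (math.exp(e / kT) / xi + 1.0) for e, gj in zip(eps, g))
```

For a single level with $\epsilon = 0$, $g = 2$ and $\nu = 1$ the condition reads
$2/(1/\xi + 1) = 1$, so that $\xi = 1$; the routine reproduces this value.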

In the following we shall also need the values of $\partial^2 Y/\partial x^2$, $\partial^2 Y/\partial z^2$, and
$\partial^2 Y/\partial x\,\partial z$ at the saddle point. To save writing the discussion will be
limited for the present to F.D. statistics. Here one finds

$$\xi^2 \frac{\partial^2 Y}{\partial x^2} = \sum_j \frac{g_j \rho_j}{(1+\rho_j)^2} \equiv A$$

$$\vartheta^2 \frac{\partial^2 Y}{\partial z^2} = \sum_j \frac{g_j \epsilon_j^2 \rho_j}{(1+\rho_j)^2} \equiv B$$

$$\xi\vartheta\, \frac{\partial^2 Y}{\partial x\,\partial z} = \sum_j \frac{g_j \epsilon_j \rho_j}{(1+\rho_j)^2} \equiv C$$

The quantities $A$, $B$, and $C$ are to be defined by these relations; it is
easily seen that, for a gas consisting of many particles, they are very large
numbers. Moreover, the reader will be able to show that

$$AB > C^2$$ (12-68)

always.

Note that the symbol $A$ has a different meaning than in the last section! Neither
is it to be confused with the Helmholtz free energy.

Having located the saddle point, we expand Y (z,z) in its neighborhood. 

r(x, 2 ) = Y{^,6) + ^ (x - I) + ^ (2 - ,?) (12-69) 

of ou 

+ (:«;-«)" + 2^ (T - i)iz - d) + ^ (2 - ^)2j + . . . 

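The first derivatives vanish at the saddle point, which, according to eqs.
(66) and (67), lies on the real axis of both x and z. In order to carry out the
integrations in (64), it is suggested that the paths be taken across ξ and θ,
and this is done with greatest convenience by choosing the circuits

$$x = \xi e^{i\alpha}, \quad -\pi < \alpha \le \pi; \qquad z = \theta e^{i\beta}, \quad -\pi < \beta \le \pi$$

When these substitutions are made in (69), this expression becomes

$$Y(x,z) = Y(\xi,\theta) - \tfrac{1}{2}\left(A\alpha^2 + B\beta^2 + 2C\alpha\beta\right)$$

in the neighborhood of ξ, θ, since for small α and β

$$x - \xi = i\xi\alpha \quad \text{and} \quad z - \theta = i\theta\beta$$

Therefore, in view of (65),

$$e^{Y(x,z)} = e^{Y(\xi,\theta)}\, e^{-(1/2)(A\alpha^2 + B\beta^2 + 2C\alpha\beta)}$$

and

$$W = (2\pi i)^{-2} \iint e^{Y(\xi,\theta)}\, e^{-(1/2)(A\alpha^2 + B\beta^2 + 2C\alpha\beta)}\,(i\,d\alpha)(i\,d\beta) \tag{12-70}$$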

This result shows with impressive clarity how rapidly the integrand
"descends" from its saddle point: its "half width" with respect to α, for
example, is approximately given by A^{−1/2}. But A is of the same order of
magnitude as the total number of particles. The procedure of the
foregoing section was therefore proper.

The question arises as to the behavior of the integrand at points on the
contour not in the neighborhood of the saddle point, for we hardly have
reason thus far to expect that it is small everywhere else. This, however,
is not difficult to prove. When written in terms of α and β, the function f
of (57) takes the form

$$f_{F.D.} = 1 + \xi\theta^{\epsilon_j} e^{i u_j}, \qquad u_j = \alpha + \epsilon_j\beta$$

and M becomes

$$M = \prod_j \left(1 + 2\xi\theta^{\epsilon_j}\cos u_j + \xi^2\theta^{2\epsilon_j}\right)^{1/2} \exp\left[i\tan^{-1}\frac{\xi\theta^{\epsilon_j}\sin u_j}{1 + \xi\theta^{\epsilon_j}\cos u_j}\right]$$

The product of all exponential factors, whose exponents are purely imaginary,
can never exceed unity in absolute value, the value which it has at the saddle
point. Each term in the first parenthesis attains its maximum when u_j = 0,
or in general 2nπ. If u_j = 2nπ for a given choice of α and β there will be
very many u_k, k ≠ j, for which the parenthesis will not assume its maximum
value, so that the product, having a great number of factors, will be much
smaller than its value at (ξ, θ). The only way to insure that M will be a
maximum is to make u_j = 0 for all j, and this requires that both α and β be
zero. This maximum will be very strong provided the number of energy
states, ε_j, is large.

^15 The ε_j must here be regarded as dimensionless and of order of magnitude unity
or greater. This can be achieved by measuring kT and ε_j in the same conveniently
chosen unit, in which case kT, and hence θ also, become pure numbers.

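It is of some interest, finally, to conclude the explicit calculation of W.
The integral in (70) is easily performed, for the limits may clearly be
replaced by +∞ and −∞. Remembering the formula

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(1/2)(A\alpha^2 + B\beta^2 + 2C\alpha\beta)}\,d\alpha\,d\beta = \frac{2\pi}{\sqrt{AB - C^2}}$$

we find

$$W = \frac{e^{Y(\xi,\theta)}}{2\pi\sqrt{AB - C^2}}$$

which, in view of (68), is real and positive. The entropy, defined in (61),
becomes therefore

$$S = kY(\xi,\theta) - k\ln\left[2\pi\sqrt{AB - C^2}\right] \tag{12-71}$$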

In chemistry, it is customary to neglect the second term of S because it is
much smaller than the first when the number of particles is large.^16 Now

$$
\begin{aligned}
Y(\xi,\theta) &= \ln M(\xi,\theta) - \nu\ln\xi - E\ln\theta\\
&= \sum_j \ln f(\xi\theta^{\epsilon_j}) - \nu\ln\xi + \frac{E}{kT}
\end{aligned}
$$

In view of (71), then,

$$E - ST = -kT\left(\sum_j \ln f - \nu\ln\xi\right) \tag{12-72}$$

This justifies the identification of the free energy made in (61).

Finally let us endeavor to make contact with classical statistics again.
Here it must be remembered that a factor ν! was omitted in the evaluation
of W. We must therefore add to (71) the quantity k ln ν!, so that we have
in place of (72) for the classical free energy A_class:

$$A_{\text{class}} = -kT\left(\sum_j \ln f - \nu\ln\xi + \ln\nu!\right)$$

But

$$\sum_j \ln f = \sum_j \xi\theta^{\epsilon_j} = \nu$$

by eq. (66') and

$$\ln\nu! \approx \nu\ln\nu - \nu$$

by Stirling's theorem. Hence

$$A_{\text{class}} = -\nu kT\ln\frac{\nu}{\xi}$$

Again by (66'), ν/ξ = Σ_j θ^{ε_j}, a quantity to be denoted by Z and often
called the partition function.^17 In terms of Z,

$$A_{\text{class}} = -\nu kT\ln Z$$

Comparison with the last equation of sec. 12.7 shows that Z is the quantum
mechanical analogue of Gibbs' phase integral.

^16 The full expression must be used when attention is given to the entropy of a
nucleus, which contains relatively few neutrons and protons.

The computational aspects of the statistical method can be simply
summarized as follows. Given is a system of ν particles whose total energy
is E. Each particle has energies ε_j obtainable by solving the Schrödinger
equation with boundary conditions corresponding to the volume in which
the particles are enclosed. For instance, if the volume is a parallelepiped,
the ε_j are given by eq. (11-36). Thus they depend on the volume of the
container.

The thermodynamic properties of the system then depend on two
parameters, ξ and θ, defined by eqs. (66') and (67'). When these are
solved simultaneously and ξ and θ are known for the given ν and E, the
quantities T, S and A can be calculated from (61). In F.D. and E.B.
statistics, eqs. (66') and (67') are such that there is no general method for
obtaining explicit solutions for ξ and θ. Recourse must then be had to
approximations, valid in different ranges of the parameter ξ.^18

Problem. Find the values of A, B, C in classical and in E.B. statistics. Note that
the classical values are obtainable from those derived in the preceding section by
letting ξ → 0.

^17 Partition functions for specific substances can often be computed from
spectroscopic data. Such calculations are becoming increasingly important in Applied
Thermodynamics. See Wilson, E. B., Chem. Rev., 27, 17 (1940); also Mayer and Mayer,
loc. cit.

^18 See for instance Fowler, loc. cit.


CHAPTER 13 


NUMERICAL CALCULATIONS 

13.1. Introduction. — We describe here certain types of numerical 
calculations which are often required. No theory^1 is presented, but the
methods are explained and illustrated by means of worked examples. 
The reader will find that such computations are usually tedious and time- 
consuming, hence he should exercise his ingenuity in devising means of 
reducing the labor involved. Before starting a calculation, he should 
always consider the possibility of using graphical methods, for these are
often simpler than the numerical ones. He should also remember that 
there is some advantage in representing numerical data by equations of 
empirical or theoretical form. Such equations, obtained by the method 
of least squares or otherwise (see sec. 13.37) are generally easier to use for 
interpolation, differentiation or integration than the methods of this 
chapter. Finally, he should note that when alternative procedures are 
given for a particular operation, the special problem at hand may often 
suggest which of these is the most suitable. 

It is assumed that the reader is familiar with the elementary facts 
concerning significant figures, rounding off and number of significant 
figures to be retained in addition, multiplication, etc.^2

For convenience, we divide this chapter into three separate parts.
The first deals with methods primarily based on interpolation formulas;
the second, with miscellaneous algebraic calculations; and the third, with a
discussion of errors and related problems.

^1 For the theory involved and more details of some of the methods, the following
references may be consulted: Scarborough, J. B., "Numerical Mathematical Analysis,"
The Johns Hopkins Press, Baltimore, 1930; Whittaker, E. T., and Robinson, G., "The
Calculus of Observations," Second Edition, Blackie and Son, London, 1932; Runge, C.,
and König, "Numerisches Rechnen," J. Springer, Berlin, 1924.

^2 These things are discussed by Crumpler, T. B., and Yoe, J. H., "Chemical
Computations and Errors," John Wiley and Sons, New York, 1940, or Scarborough, loc. cit.

PART 1. NUMERICAL METHODS BASED ON
INTERPOLATION FORMULAS

INTERPOLATION

13.2. Interpolation for Equal Values of the Argument. — It often happens
that data are given in tabular form with values of x and y = f(x) at
certain intervals of x. Suppose a value of y is needed for an x which is not
listed in the table. Usually, the simplest procedure is to plot y against x,
draw a smooth curve through the points and read from the graph the
required value of y. The same result may be obtained by the use of
interpolation formulas. Provided the given values of x are equidistant,
we first form a difference table^3 as shown in Table 1, where the first, second,
third and r-th differences are given by

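$$
\begin{aligned}
\Delta y_0 &= y_1 - y_0,\quad \Delta y_1 = y_2 - y_1,\ \ldots,\ \Delta y_{n-1} = y_n - y_{n-1},\quad \Delta y_n = y_{n+1} - y_n\\
\Delta^2 y_0 &= \Delta y_1 - \Delta y_0 = y_2 - 2y_1 + y_0,\ \ldots\\
\Delta^2 y_n &= \Delta y_{n+1} - \Delta y_n = y_{n+2} - 2y_{n+1} + y_n\\
\Delta^3 y_n &= \Delta^2 y_{n+1} - \Delta^2 y_n = y_{n+3} - 3y_{n+2} + 3y_{n+1} - y_n\\
\Delta^r y_n &= \Delta^{r-1} y_{n+1} - \Delta^{r-1} y_n = y_{n+r} - r\,y_{n+r-1} + \frac{r(r-1)}{2!}\,y_{n+r-2} - \cdots + (-1)^r y_n\\
&= \sum_{m=0}^{r} (-1)^m \frac{r!}{m!\,(r-m)!}\, y_{n+r-m}
\end{aligned}
\tag{13-1}
$$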


TABLE 1

x     y     Δ      Δ²      Δ³      Δ⁴      Δ⁵      Δ⁶

x₀    y₀
x₁    y₁    Δy₀
x₂    y₂    Δy₁    Δ²y₀
x₃    y₃    Δy₂    Δ²y₁    Δ³y₀
x₄    y₄    Δy₃    Δ²y₂    Δ³y₁    Δ⁴y₀
x₅    y₅    Δy₄    Δ²y₃    Δ³y₂    Δ⁴y₁    Δ⁵y₀
x₆    y₆    Δy₅    Δ²y₄    Δ³y₃    Δ⁴y₂    Δ⁵y₁    Δ⁶y₀


In forming such a table of differences, care must be taken to maintain the
correct signs; the subtractions must all be performed in the order given in
(1). A convenient check may be obtained by noting that the sum of the
entries in any column equals the difference between the first and last
entries in the preceding column. It also happens in most cases that the
differences of some order will be zero or will vary (perhaps with alternating
signs) only in the last few figures of the numbers retained. This is the
basis for all of the methods described in the first part of this chapter, for if
the unknown f(x) were a polynomial of the n-th degree, the n-th differences
would be constant and the (n + 1)-th differences zero.
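
The construction of the difference table and the column-sum check described above can be sketched in a few lines. The function name `difference_table` and the cubic test data are illustrative choices, not part of the text.

```python
def difference_table(y):
    """Return [y, Δy, Δ²y, ...]; column r holds Δ^r y_0, Δ^r y_1, ..."""
    columns = [list(y)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        # Always subtract in the order y_{i+1} - y_i, as in Eq. (13-1).
        columns.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return columns

# y = x**3 at x = 0, 1, ..., 5: a cubic, so the third differences are constant.
y = [x ** 3 for x in range(6)]
table = difference_table(y)
for order, column in enumerate(table):
    print(order, column)

# Check from the text: each column sums to (last - first) of the preceding one.
checks = [sum(col) == prev[-1] - prev[0] for prev, col in zip(table, table[1:])]
print(all(checks))
```

For this cubic the third differences are all 6 and the fourth differences vanish, which is the polynomial property stated in the last sentence above.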

^3 Many different notations and forms of the difference table will be found in books
on numerical methods but it will usually be simple to find the relations between the
various symbols used.

Now if x_k and y_k are values given in such a table, h is the common
interval of x, h = x_1 − x_0 = x_2 − x_1 = ... = x_n − x_{n−1}, and

$$x = x_k + hu; \qquad u = \frac{(x - x_k)}{h} \tag{13-2}$$


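then a value of y for an x not contained in the table is given by Newton's
interpolation formula,

$$
\begin{aligned}
y = y_k + u\,\Delta y_k &+ \frac{u(u-1)}{2!}\,\Delta^2 y_k + \frac{u(u-1)(u-2)}{3!}\,\Delta^3 y_k + \cdots\\
&+ \frac{u(u-1)(u-2)\cdots(u-r+1)}{r!}\,\Delta^r y_k
\end{aligned}
\tag{13-3}
$$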


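A second useful form of this equation may also be obtained:

$$
\begin{aligned}
y = y_k + u\,\Delta y_{k-1} &+ \frac{u(u+1)}{2!}\,\Delta^2 y_{k-2} + \frac{u(u+1)(u+2)}{3!}\,\Delta^3 y_{k-3} + \cdots\\
&+ \frac{u(u+1)(u+2)\cdots(u+r-1)}{r!}\,\Delta^r y_{k-r}
\end{aligned}
\tag{13-4}
$$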


It will be noticed that (3) involves differences lying on a diagonal line in the
table, starting from y_k, while (4) uses differences on a horizontal line from
y_k. Thus (3) should be used for interpolation near the beginning of a
difference table and (4) for interpolation near the end. Summation should
be continued until the desired number of significant figures is obtained.
These two formulas may also be used to extrapolate at both ends of the
difference table but due caution should be used in such cases unless it is
known that the function is continuous beyond the tabulated values.

Example 1. Interpolate in Table 2, to find y = e^{−x²} for x = 0.0477.
We take x_k = 0.05, thus h = 0.05, u = −0.046. Using (3),

$$
\begin{aligned}
y &= 0.99750 + 4.6 \times 7.45 \times 10^{-5} - \frac{4.6 \times 1.046 \times 4.85}{2} \times 10^{-5} - \frac{4.6 \times 1.046 \times 2.046 \times 1.9}{6} \times 10^{-6}\\
&= 0.99750 + 0.00034 - 0.00012 = 0.99772
\end{aligned}
$$

It will be noticed that the third and fourth differences are too small for
consideration. The result is correct to the last figure given, as may be
found by expanding e^{−x²} in a power series. In this case, the calculations
may easily be performed with a slide-rule.

TABLE 2^4

x       y = e^{−x²}    Δ        Δ²      Δ³     Δ⁴

0       1.00000
0.05    0.99750        −250
0.10    0.99005        −745     −495
0.15    0.97775        −1230    −485    +10
0.20    0.96079        −1696    −466    +19    +9
0.25    0.93941        −2138    −442    +24    +5
0.30    0.91393        −2548    −410    +32    +8

Example 2. Calculate y = e^{−x²} for x = 0.2862. Since this value is
near the end of the table, it is better to use (4) with x_k = 0.30, u = −0.276.
Then,

$$
\begin{aligned}
y &= 0.91393 + 2.548 \times 2.76 \times 10^{-3} + \frac{4.1 \times 2.76 \times 7.24}{2} \times 10^{-5} - \frac{3.2 \times 2.76 \times 7.24 \times 1.724}{6} \times 10^{-6}\\
&= 0.91393 + 0.00703 + 0.00041 - 0.00002 = 0.92135
\end{aligned}
$$

This result is also correct to the last significant figure.
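
The two forms of Newton's formula can be sketched as follows. This is a minimal illustration, assuming the tabulated values of y = e^{−x²} at x = 0, 0.05, ..., 0.30 given in Table 2; the function names and the `order` argument (number of difference terms retained) are hypothetical.

```python
import math

def difference_table(y):
    """Column r of the result holds Δ^r y_0, Δ^r y_1, ..."""
    cols = [list(y)]
    while len(cols[-1]) > 1:
        p = cols[-1]
        cols.append([p[i + 1] - p[i] for i in range(len(p) - 1)])
    return cols

def newton_forward(xs, ys, x, k, order):
    """Eq. (13-3): uses Δ^r y_k, the diagonal starting at y_k."""
    u = (x - xs[k]) / (xs[1] - xs[0])
    d = difference_table(ys)
    total, coeff = ys[k], 1.0
    for r in range(1, order + 1):
        coeff *= (u - (r - 1)) / r        # u(u-1)...(u-r+1)/r!
        total += coeff * d[r][k]
    return total

def newton_backward(xs, ys, x, k, order):
    """Eq. (13-4): uses Δ^r y_{k-r}, the horizontal line from y_k."""
    u = (x - xs[k]) / (xs[1] - xs[0])
    d = difference_table(ys)
    total, coeff = ys[k], 1.0
    for r in range(1, order + 1):
        coeff *= (u + (r - 1)) / r        # u(u+1)...(u+r-1)/r!
        total += coeff * d[r][k - r]
    return total

xs = [0.05 * i for i in range(7)]
ys = [1.00000, 0.99750, 0.99005, 0.97775, 0.96079, 0.93941, 0.91393]

y1 = newton_forward(xs, ys, 0.0477, k=1, order=3)    # near the start of the table
y2 = newton_backward(xs, ys, 0.2862, k=6, order=3)   # near the end of the table
print(round(y1, 5), round(y2, 5))
print(math.exp(-0.0477 ** 2), math.exp(-0.2862 ** 2))
```

Both interpolated values agree with the directly computed exponentials to about the accuracy of the five-figure table.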

An arrangement of tabulated data, somewhat different from that of
Table 1, leads to central difference formulas, notably those of Stirling and
Bessel. While these converge faster than Newton's formula, this advantage
in most cases is of no practical importance.^5

Problem. Interpolate or extrapolate from the data of Table 2 to find y = e^{−x²}
for x = 0.045; 0.2775; 0.3018.


13.3. Interpolation for Unequal Values of the Argument. — When the
values of x are given for unequal intervals, (3) and (4) do not apply, but it
is possible to use divided differences or the interpolation formula of Lagrange.
Both methods are tedious to apply and not very precise, hence it is usually
better to interpolate from a suitable graph. We give Lagrange's formula
only; for the method of divided differences, Whittaker and Robinson
(loc. cit.) may be consulted. Suppose x_0, x_1, ..., x_n and y_0, y_1, ..., y_n
are known, then for some other value of x,

$$
\begin{aligned}
y = f(x) &= \frac{(x - x_1)(x - x_2)\cdots(x - x_n)}{(x_0 - x_1)(x_0 - x_2)\cdots(x_0 - x_n)}\,y_0
+ \frac{(x - x_0)(x - x_2)\cdots(x - x_n)}{(x_1 - x_0)(x_1 - x_2)\cdots(x_1 - x_n)}\,y_1 + \cdots\\
&+ \frac{(x - x_0)(x - x_1)\cdots(x - x_{n-1})}{(x_n - x_0)(x_n - x_1)\cdots(x_n - x_{n-1})}\,y_n
\end{aligned}
\tag{13-5}
$$

^4 Following the usual custom, we omit zeros after the decimal point in the various
differences.

^5 For details concerning central differences, see Scarborough or Whittaker and
Robinson, loc. cit.


Example 3. The following data were obtained in the calibration of a
platinum-rhodium thermocouple. Find the temperature corresponding to
a reading of 9.000 millivolts.

t, °C.            630.5    960.5    1063.0
e, millivolts     5.535    9.117    10.301

With x = 9.000, x_0 = 5.535, x_1 = 9.117, x_2 = 10.301, y_0 = 630.5, y_1 =
960.5, y_2 = 1063.0,

$$
y = \frac{(-0.117)(-1.301)(630.5)}{(-3.582)(-4.766)} + \frac{(3.465)(-1.301)(960.5)}{(3.582)(-1.184)} + \frac{(3.465)(-0.117)(1063.0)}{(4.766)(1.184)} = 950.2^\circ\text{C}.
$$

The value obtained from a carefully constructed curve is 950.2°C.
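
Lagrange's formula (13-5) is short enough to sketch directly. The function name `lagrange` is illustrative; the data are the three calibration points of the thermocouple example, with the electromotive force taken as the argument.

```python
def lagrange(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial, Eq. (13-5), at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)   # one factor of the i-th coefficient
        total += term
    return total

emf = [5.535, 9.117, 10.301]      # millivolts
temp = [630.5, 960.5, 1063.0]     # degrees C

t = lagrange(emf, temp, 9.000)
print(round(t, 1))                # about 950.2
```

Evaluated with these three points, the formula gives about 950.2°C, in agreement with the graphical value quoted above. Exchanging the two lists in the call gives the inverse relation, electromotive force as a function of temperature.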

13.4. Inverse Interpolation. — The problem of inverse interpolation, as
the name implies, is that of finding a value of x corresponding to a given
value of y = f(x). From Lagrange's formula, it is seen that the roles of x
and y may be interchanged so that (5) may be used for inverse interpolation
by rewriting it to give x = φ(y). An illustration of this application
of (5) is shown in the following problem. Inverse interpolation may also
be effected by reversion of the series (3) or (4) to find u as a function of y
and Δy. The unknown x is then obtained from (2) or by a method of
successive approximations. Full details of both procedures are given by
Scarborough (loc. cit.).

Problem. From the data of Example 3, sec. 13.3, find the electromotive force of
the thermocouple when the temperature is 760°C.


13.5. Two-way Interpolation. — Suppose the tabulated quantity is
given as a function of two independent variables, for example, the index
of refraction of water as it varies with both temperature and wavelength.
Interpolation to give a value of y for two variables not contained in such
tables is best performed by using Newton's formula to interpolate for each
variable separately. Series, similar to Newton's for direct two-way
interpolation, are given by Scarborough (loc. cit.).


NUMERICAL DIFFERENTIATION 


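13.6. Differentiation Using Interpolation Formula. — In order to determine
the numerical derivative of a function of x at a given point, the slope
of the curve of the function may be obtained by graphical means or the
data may be fitted to an empirical equation which is then differentiated.
We may also write

$$\frac{dy}{dx} = \left(\frac{dy}{du}\right)\left(\frac{du}{dx}\right) \tag{13-6}$$

and if we use (2) and (3) we get

$$\frac{dy}{dx} = \frac{1}{h}\frac{dy}{du} = \frac{1}{h}\left[\Delta y_k + \frac{(2u - 1)}{2!}\,\Delta^2 y_k + \frac{(3u^2 - 6u + 2)}{3!}\,\Delta^3 y_k + \cdots\right] \tag{13-7}$$

At the point x = x_k, u = 0, so we have

$$
\begin{aligned}
\left(\frac{dy}{dx}\right)_{x=x_k} &= \frac{1}{h}\left[\Delta y_k - \tfrac{1}{2}\Delta^2 y_k + \tfrac{1}{3}\Delta^3 y_k - \tfrac{1}{4}\Delta^4 y_k + \cdots\right]\\
\left(\frac{d^2y}{dx^2}\right)_{x=x_k} &= \frac{1}{h^2}\left[\Delta^2 y_k - \Delta^3 y_k + \tfrac{11}{12}\Delta^4 y_k - \cdots\right]
\end{aligned}
\tag{13-8}
$$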

More terms and higher order derivatives may be readily found. Since the
lower order differences disappear upon differentiation, the convergence of
(8) is slower than that of (3) or (4), therefore derivatives obtained in this
way are not very precise.

Maxima or minima in a tabulated function may be found by substituting
the differences in (7), equating the derivative to zero and solving for u
and then for x from the relation x = x_k + hu.
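
The truncated series (13-8) can be sketched as follows. This is an illustration only: the function names are hypothetical, the series is cut off after the fourth differences, and the data are five-figure values of y = e^{−x²} at x = 0, 0.05, ..., 0.30, for which the exact derivatives are known.

```python
import math

def difference_table(y):
    """Column r of the result holds Δ^r y_0, Δ^r y_1, ..."""
    cols = [list(y)]
    while len(cols[-1]) > 1:
        p = cols[-1]
        cols.append([p[i + 1] - p[i] for i in range(len(p) - 1)])
    return cols

def derivatives_at_node(ys, h, k):
    """First and second derivative at x_k from Eq. (13-8), through Δ⁴."""
    d = difference_table(ys)
    first = (d[1][k] - d[2][k] / 2 + d[3][k] / 3 - d[4][k] / 4) / h
    second = (d[2][k] - d[3][k] + 11 * d[4][k] / 12) / h ** 2
    return first, second

h = 0.05
ys = [1.00000, 0.99750, 0.99005, 0.97775, 0.96079, 0.93941, 0.91393]

first, second = derivatives_at_node(ys, h, k=1)       # x = 0.05
x = 0.05
exact_first = -2 * x * math.exp(-x * x)
exact_second = 2 * math.exp(-x * x) * (2 * x * x - 1)
print(first, exact_first)
print(second, exact_second)
```

The comparison with the exact derivatives shows the limited precision mentioned above: rounding errors in the tabulated values are magnified by the divisions by h and h².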

Example 4. Find dy/dx and d²y/dx² for y = e^{−x²} at the point x = 0.05
from the data of Table 2.

$$
\begin{aligned}
\left(\frac{dy}{dx}\right)_{x=0.05} &= \frac{1}{0.05}\left[-0.00745 + \frac{0.00485}{2} + \frac{0.00019}{3} - \frac{0.00005}{4}\right] = -0.09948\\
\left(\frac{d^2y}{dx^2}\right)_{x=0.05} &= \frac{1}{(0.05)^2}\left[-0.00485 - 0.00019 + \frac{0.00055}{12}\right] = -1.9977
\end{aligned}
$$

The values found by differentiation are

dy/dx = −2xy = −0.099750
d²y/dx² = 2y(2x² − 1) = −1.985025

13.7. Differentiation Using a Polynomial. — Another method of finding
the derivative has been described by Rutledge.^6 It does not depend on
differences but assumes that the given data can be fitted to a polynomial of
the fourth or lower degree. Five points must be known, that is, five values
of x and y. If h is the equal interval between successive values of x, the
derivative of y = f(x) at the point x = x_k is given by the three following
approximately equivalent expressions.

$$
\begin{aligned}
\left(\frac{dy}{dx}\right)_{x=x_k} &= \frac{1}{12h}\left[3y_{k+1} + 10y_k - 18y_{k-1} + 6y_{k-2} - y_{k-3}\right]\\
&= \frac{1}{12h}\left[(y_{k-2} - y_{k+2}) - 8(y_{k-1} - y_{k+1})\right]\\
&= \frac{1}{12h}\left[y_{k+3} - 6y_{k+2} + 18y_{k+1} - 10y_k - 3y_{k-1}\right]
\end{aligned}
\tag{13-9}
$$


These equations are particularly suitable for solution by one continuous 
operation with a calculating machine. The method may be extended to 
apply to polynomials of degree higher than four or to derivatives of higher 
order. 

Example 5. Find dy/dx at x = 0.15 for y = e^{−x²} using the data of
Table 2 and the method of this section.

$$
\begin{aligned}
\left(\frac{dy}{dx}\right)_{x=0.15} &= \frac{1}{12 \times 0.05}\left[3 \times 0.96079 + 9.7775 - 18 \times 0.99005 + 6 \times 0.99750 - 1\right] = -0.2934\\
&= \frac{1}{0.6}\left[(0.99750 - 0.93941) - 8(0.99005 - 0.96079)\right] = -0.2933\\
&= \frac{1}{0.6}\left[0.91393 - 6 \times 0.93941 + 18 \times 0.96079 - 9.7775 - 3 \times 0.99005\right] = -0.2933
\end{aligned}
$$

By direct differentiation, dy/dx = −0.293325.
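
The three five-point expressions of (13-9) can be sketched as below. The function name and the labels `backward`, `central` and `forward` are illustrative; the data are again the five-figure values of y = e^{−x²} at x = 0, 0.05, ..., 0.30.

```python
import math

def five_point_derivatives(ys, h, k):
    """Return the three estimates of dy/dx at x_k from Eq. (13-9)."""
    backward = (3 * ys[k + 1] + 10 * ys[k] - 18 * ys[k - 1]
                + 6 * ys[k - 2] - ys[k - 3]) / (12 * h)
    central = ((ys[k - 2] - ys[k + 2]) - 8 * (ys[k - 1] - ys[k + 1])) / (12 * h)
    forward = (ys[k + 3] - 6 * ys[k + 2] + 18 * ys[k + 1]
               - 10 * ys[k] - 3 * ys[k - 1]) / (12 * h)
    return backward, central, forward

ys = [1.00000, 0.99750, 0.99005, 0.97775, 0.96079, 0.93941, 0.91393]
estimates = five_point_derivatives(ys, h=0.05, k=3)   # x = 0.15
exact = -2 * 0.15 * math.exp(-0.15 ** 2)
print(estimates, exact)
```

All three estimates agree with the exact derivative to about four significant figures, which is as much as five-figure data can be expected to give.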

Problem. Use the data of Table 5 to find dy/dx and d²y/dx² at x = 0.75 by the
methods of secs. 13.6 and 13.7.


NUMERICAL INTEGRATION 

13.8. Introduction. — Suppose f(x) is known to be continuous over an
interval of x from a to b but that either the explicit form of f(x) is unknown
or it is such a function that its definite integral cannot be determined
conveniently in terms of other known functions. Numerical evaluation of
such integrals, a process called approximate quadratures, depends on
replacing the integral ∫_a^b f(x)dx by another integral ∫_a^b φ(x)dx where φ(x)
can be determined in a simple way. If f(x) is known to have the (n + 1)
values y_0, y_1, ..., y_n at (n + 1) points within the interval (a, b), the latter
integral may be expressed as

$$\int_a^b \phi(x)\,dx = A_0 y_0 + A_1 y_1 + \cdots + A_n y_n \tag{13-10}$$

where the (n + 1) quantities A_m are independent of the (n + 1) values
of the y_m. It follows that if f(x) is a polynomial of degree ≤ n, the error
made in replacing ∫_a^b f(x)dx by Σ A_m y_m may be made to vanish by the
proper choice of the A_m. If f(x) is a polynomial of degree > n, the
difference between the true value of the integral and (10) may still be small
enough to make this procedure useful. We first consider the methods
where the y_m are known at equal intervals.

^6 Rutledge, G., Phys. Rev. 40, 262 (1932).

13.9. The Euler-Maclaurin Formula. — If the explicit form of f(x) is
known and it has finite derivatives at the upper and lower limits of the
integral or if these derivatives may be determined by numerical methods,
the Euler-Maclaurin formula may be used to evaluate the integral.
Indicating the values of f(x) at x = a and at x = b by y_0 and y_n and the
intermediate values by y_1, y_2, y_3, ..., this formula is written

$$\int_a^b f(x)\,dx = h\left[\frac{y_0}{2} + y_1 + y_2 + \cdots + y_{n-1} + \frac{y_n}{2}\right] - \sum_{r=2}^{\infty} \frac{B_r h^r}{r!}\left[y_n^{(r-1)} - y_0^{(r-1)}\right] \tag{13-11}$$

where y_n^{(r)} and y_0^{(r)} are the r-th derivatives of f(x) at the points b and a.
The numerical coefficients B_r are the Bernoulli numbers, defined by the
relation

$$\frac{x}{e^x - 1} = \sum_{n=0}^{\infty} \frac{B_n x^n}{n!} \tag{13-12}$$

which may be rearranged to give the identity

$$\sum_{n=0}^{\infty} \frac{x^n}{(n+1)!}\ \sum_{n=0}^{\infty} \frac{B_n x^n}{n!} = 1$$

Successive values of B_r are obtained from this equation by equating the
coefficients of equal powers of x to zero or more simply as follows. Expand
the equation

$$(B + 1)^n = B^n \tag{13-13}$$

for a given value of n. Now set B^r = B_r, and if B_0 = 1, B_{n−2}, B_{n−3}, ...
are known, B_{n−1} may be found. A few of the Bernoulli numbers^7 are
B_1 = −1/2; B_3 = B_5 = B_7 = ... = 0; B_2 = 1/6; B_4 = −1/30; B_6 = 1/42;
B_8 = −1/30; B_10 = 5/66. Putting these numbers in (11) and using the
notation (later to be clarified)

$$I_T = h\left[\frac{y_0}{2} + y_1 + y_2 + \cdots + y_{n-1} + \frac{y_n}{2}\right]$$

we obtain the Euler-Maclaurin formula in expanded form

$$\int_a^b f(x)\,dx \approx I_{EM} = I_T - \frac{h^2}{12}\Delta' + \frac{h^4}{720}\Delta''' - \frac{h^6}{30240}\Delta^{v} + \frac{h^8}{1209600}\Delta^{vii} - \cdots \tag{13-14}$$

where

$$\Delta^{(r)} = \left[y_n^{(r)} - y_0^{(r)}\right]$$
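
The symbolic rule (13-13) can be turned into a short recurrence. The sketch below assumes the usual reading of the rule: expanding (B + 1)^n, replacing B^r by B_r and cancelling B^n leaves Σ_{j<n} C(n, j) B_j = 0, which is solved for B_{n−1}. The function name is illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] as exact fractions."""
    B = [Fraction(1)]
    for n in range(2, count + 1):
        # sum_{j=0}^{n-1} C(n, j) B_j = 0 and C(n, n-1) = n give B_{n-1}.
        s = sum(comb(n, j) * B[j] for j in range(n - 1))
        B.append(-s / n)
    return B

B = bernoulli_numbers(11)
for r in (1, 2, 4, 6, 8, 10):
    print(r, B[r])

# Coefficients B_r / r! that multiply h^r in Eq. (13-11).
coefficients = {r: B[r] / factorial(r) for r in (2, 4, 6, 8)}
print(coefficients)
```

The coefficients B_r/r! come out as 1/12, −1/720, 1/30240 and −1/1209600, whose magnitudes are the denominators appearing in the expanded formula.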


TABLE 3

x      1/x         Δ          Δ²        Δ³        Δ⁴

1.0    1.000000
1.2    0.833333    −166667
1.4    0.714286    −119047    +47620
1.6    0.625000    −89286     +29761    −17859
1.8    0.555556    −69444     +19842    −9919     +7940
2.0    0.500000    −55556     +13888    −5954     +3965

Example 6. Divide the interval between 1.0 and 2.0 into five equal
parts and calculate ∫_{1.0}^{2.0} (dx/x) by the Euler-Maclaurin
formula. The required values of f(x) = 1/x are given in Table 3. We
also need the derivatives of odd order which are

f'(x) = −1/x²;  f'''(x) = −6/x⁴;  f^v(x) = −120/x⁶

Then f'(1) = −1; f'''(1) = −6; f^v(1) = −120; f'(2) = −0.25;
f'''(2) = −0.375; f^v(2) = −1.875. Since h = 0.2, we also find h²/12 =
0.003333; h⁴/720 = 2.222 × 10⁻⁶; h⁶/30240 = 2.1 × 10⁻⁹; Δ' = 0.75;
Δ''' = 5.625; Δ^v = 118.1; I_T = 0.695635. Hence,

I_EM = 0.695635 − 0.002500 + 0.000012 − (2.5 × 10⁻⁷)
     = 0.693147

The fifth derivatives contribute to the final result only after six significant
figures. A more exact value of the integral is ln 2 = 0.69314718.

^7 Those with even subscript only are required in the Euler-Maclaurin formula.
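
The expanded Euler-Maclaurin formula (13-14) can be sketched as follows for the integral of 1/x from 1 to 2 with five sub-intervals. The function name and the convention of passing the odd-order derivatives as a list of callables are assumptions made for this illustration.

```python
import math

def euler_maclaurin(f, odd_derivatives, a, b, n):
    """Eq. (13-14); odd_derivatives = [f', f''', f^(5), ...] as callables."""
    h = (b - a) / n
    ys = [f(a + i * h) for i in range(n + 1)]
    total = h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))        # trapezoidal term I_T
    coefficients = [-1 / 12, 1 / 720, -1 / 30240, 1 / 1209600]
    for m, (c, d) in enumerate(zip(coefficients, odd_derivatives)):
        # The term of order h^(2m+2) uses the derivative of order 2m+1.
        total += c * h ** (2 * m + 2) * (d(b) - d(a))
    return total

result = euler_maclaurin(
    lambda x: 1 / x,
    [lambda x: -1 / x ** 2, lambda x: -6 / x ** 4, lambda x: -120 / x ** 6],
    1.0, 2.0, 5,
)
print(result, math.log(2))
```

With three correction terms the result differs from ln 2 by about one unit in the eighth decimal place, which is the size of the first neglected term.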

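13.10. Gregory's Formula. — In case the explicit form of f(x) is
unknown, we may rewrite (11), using (8) in place of the derivatives, and obtain
Gregory's formula

$$
\begin{aligned}
\int_a^b f(x)\,dx \approx I_{Gr} = h\left[\frac{y_0}{2} + y_1 + y_2 + \cdots + y_{n-1} + \frac{y_n}{2}\right]
&- \frac{h}{12}\left(\Delta y_{n-1} - \Delta y_0\right) - \frac{h}{24}\left(\Delta^2 y_{n-2} + \Delta^2 y_0\right)\\
&- \frac{19h}{720}\left(\Delta^3 y_{n-3} - \Delta^3 y_0\right) - \frac{3h}{160}\left(\Delta^4 y_{n-4} + \Delta^4 y_0\right) - \cdots
\end{aligned}
\tag{13-15}
$$

It should be observed that the contents of the parentheses are alternately
differences and sums. Additional coefficients of −h(Δ^r y_{n−r} ± Δ^r y_0) may
be found by evaluating, in absolute value, the definite integral

$$\frac{1}{(r+1)!}\int_0^1 z(z - 1)(z - 2)\cdots(z - r)\,dz \tag{13-16}$$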

(r + 1)! Jo 

Example 7. Evaluate the integral of Example 6 by means of Gregory's
formula. We find h/12 = 0.01667; h/24 = 0.008333; 19h/720 = 0.0053;
3h/160 = 0.0038. Hence,

I_Gr = 0.695635 − 0.01667(−0.055556 + 0.166667)
     − 0.008333(0.013888 + 0.047620) − 0.0053(−0.005954 + 0.017859)
     − 0.0038(0.003965 + 0.007940) = 0.693163

The result is not as precise as that obtained in Example 6 because of the
small number of available differences.
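
Gregory's formula through fourth differences can be sketched as below. The function names are illustrative, and the ordinates are computed directly from 1/x rather than copied from a table.

```python
import math

def difference_table(y):
    """Column r of the result holds Δ^r y_0, Δ^r y_1, ..."""
    cols = [list(y)]
    while len(cols[-1]) > 1:
        p = cols[-1]
        cols.append([p[i + 1] - p[i] for i in range(len(p) - 1)])
    return cols

def gregory(ys, h):
    """Eq. (13-15), using as many of the first four corrections as exist."""
    n = len(ys) - 1
    d = difference_table(ys)
    total = h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))      # trapezoidal term
    coefficients = [1 / 12, 1 / 24, 19 / 720, 3 / 160]
    for r, c in enumerate(coefficients, start=1):
        if r > n:
            break
        sign = -1 if r % 2 else 1     # a difference for odd r, a sum for even r
        total -= c * h * (d[r][n - r] + sign * d[r][0])
    return total

h = 0.2
ys = [1 / (1.0 + i * h) for i in range(6)]
result = gregory(ys, h)
print(result, math.log(2))
```

The value obtained is close to 0.693163, about 1.6 × 10⁻⁵ above ln 2, illustrating the loss of precision when only a few differences are available.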


Problem. Evaluate the integral of Example 8, sec. 13.11, by the Euler-Maclaurin
and Gregory formulas. Divide the interval into five equal parts.

13.11. The Newton-Cotes Formula. — Instead of using differences, it
is possible to rewrite (11) or (15) in terms of the y_m, since the Δ^r y_m may be
reduced to sums of y_m by means of (1). The resulting equation, called the
Newton-Cotes formula, is of the form of (10) where

$$A_m = \frac{(-1)^{n-m}\,h}{m!\,(n - m)!}\int_0^n \frac{z(z - 1)(z - 2)\cdots(z - n)}{(z - m)}\,dz \tag{13-17}$$

Table 4 gives the A_m for several values of n. The values found in this
way may be easily checked since it is necessary that

$$A_0 + A_1 + A_2 + \cdots + A_n = nh \tag{13-18}$$

When this method is used, it is simpler to divide the interval from a to b
into a number of sub-intervals. The number of y_m's in each of these
determines the appropriate A_m from Table 4. The value of the integral then
equals the sum of the separate terms obtained by applying (10) to each
sub-interval. We give a few special cases of the Newton-Cotes formula.


TABLE 4

n = 1    A_0 = A_1 = h/2
    2    A_0 = A_2 = h/3; A_1 = 4h/3
    3    A_0 = A_3 = 3h/8; A_1 = A_2 = 9h/8
    4    A_0 = A_4 = 14h/45; A_1 = A_3 = 64h/45; A_2 = 8h/15
    5    A_0 = A_5 = 95h/288; A_1 = A_4 = 125h/96;
         A_2 = A_3 = 125h/144
    6    A_0 = A_6 = 41h/140; A_1 = A_5 = 54h/35;
         A_2 = A_4 = 27h/140; A_3 = 204h/105
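
The coefficients of Table 4 can be reproduced exactly from (13-17). The sketch below is one possible implementation, not the authors' procedure: it multiplies out the polynomial in z with rational arithmetic and integrates term by term; the function name is illustrative.

```python
from fractions import Fraction
from math import factorial

def newton_cotes(n):
    """Return [A_0/h, ..., A_n/h] from Eq. (13-17) as exact fractions."""
    weights = []
    for m in range(n + 1):
        poly = [Fraction(1)]               # coefficients, lowest power first
        for j in range(n + 1):
            if j == m:
                continue                   # omit the factor (z - m)
            new = [Fraction(0)] * (len(poly) + 1)
            for p, c in enumerate(poly):   # multiply the polynomial by (z - j)
                new[p] -= j * c
                new[p + 1] += c
            poly = new
        # Integrate term by term from 0 to n.
        integral = sum(c * Fraction(n) ** (p + 1) / (p + 1)
                       for p, c in enumerate(poly))
        sign = -1 if (n - m) % 2 else 1
        weights.append(sign * integral / (factorial(m) * factorial(n - m)))
    return weights

for n in range(1, 7):
    w = newton_cotes(n)
    print(n, [str(x) for x in w], "sum =", sum(w))
```

Each row sums to n, which is the check (13-18) with h = 1.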


a. The Trapezoidal Rule. If each sub-interval contains two values of
y_m, n = 1, A_0 = A_1 = h/2 and^8

$$\int_a^b f(x)\,dx \approx I_T = h\left[\frac{y_0}{2} + y_1 + y_2 + \cdots + y_{n-1} + \frac{y_n}{2}\right] \tag{13-19}$$

This result is exact if the first differences of f(x) are constant. It will be
noticed that (19) forms the principal term in both the Euler-Maclaurin
and Gregory formulas.

b. Simpson's Rule. If the y_m divide the interval into an even number of
parts and we divide each sub-interval in two parts, we obtain, with n = 2 from
Table 4, Simpson's One-Third Rule:

$$\int_a^b f(x)\,dx \approx I_S = \frac{h}{3}\left[y_0 + 4(y_1 + y_3 + \cdots + y_{n-1}) + 2(y_2 + y_4 + \cdots + y_{n-2}) + y_n\right] \tag{13-20}$$

This is exact if second differences of f(x) are constant. It is probably the
most generally useful of all quadrature formulas.

c. Weddle's Rule. Taking n = 6 from Table 4 and neglecting all
differences above the sixth, we obtain Weddle's Rule:

$$
\begin{aligned}
\int_a^b f(x)\,dx \approx I_W = \frac{3h}{10}\,[\,&y_0 + 5y_1 + y_2 + 6y_3 + y_4 + 5y_5\\
&+ 2y_6 + 5y_7 + y_8 + 6y_9 + y_{10} + 5y_{11} + 2y_{12} + \cdots\\
&+ 2y_{n-6} + 5y_{n-5} + y_{n-4} + 6y_{n-3} + y_{n-2} + 5y_{n-1} + y_n\,]
\end{aligned}
\tag{13-21}
$$

This is the most accurate of the formulas^9 in this section but it has the
disadvantage that the interval must be divided into a number of parts equal
to six or some multiple of it.

Various other special cases of the Newton-Cotes equation may be
developed. The best known of these, generally called Simpson's
Three-Eighths Rule, is obtained from (10) and Table 4 with n = 3. As shown
by Scarborough (loc. cit.), it is inferior to the One-Third Rule and should
never be used.

^8 In order to avoid confusion, it should be noted that n has been taken with two
meanings. In Table 4, it refers to the number of intervals between the lower and upper
limits of the integral. It now refers to the number of divisions of the sub-intervals.
As a subscript in (19), (20) and (21) it indicates the last available value of y as in
previous equations.


Example 8. Evaluate

$$\int_0^{1.5} \frac{x^3}{e^x - 1}\,dx$$

by the three preceding methods.
This integral is of importance in the Debye theory of the heat capacity of
solids;^10 it cannot be evaluated in terms of other known functions. Values
of the integral between 0 and u, with u from 0.01 to 24 in steps of 0.01, have
been given to six places by Beattie;^11 from his table, I = 0.615495.
Dividing the interval 0 to 1.50 into six equal parts, we obtain Table 5.
Since h = 0.25, we find

I_T = 0.25 × (1.991643 + 0.484678)
    = 0.619080

I_S = (0.25/3) × (4 × 1.216979 + 2 × 0.774664 + 0.969357)
    = 0.615550

I_W = (3 × 0.25/10) × (5 × 0.839293 + 1.744021 + 6 × 0.377686)
    = 0.615495

It is thus seen that the trapezoidal rule is the least accurate of these three
equations while Weddle's rule and Simpson's rule give nearly the same
results.

^9 Note that the last term in (21) has the coefficient unity if n = 6 or some multiple
of 6. In deriving this formula, the coefficient of the term Δ⁶y_0 is 41/140, which is taken
to be 3/10 in order to make the final form of the equation as simple as possible. The
resulting error is negligible.

^10 See, for example, Mayer and Mayer, "Statistical Mechanics," John Wiley and
Sons, New York, 1940.

^11 Beattie, J. Math. Phys. 6, 1 (1926).

TABLE 5

x       f(x) = x³/(e^x − 1)

0       0
0.25    0.055013
0.50    0.192687
0.75    0.377686
1.00    0.581977
1.25    0.784280
1.50    0.969357
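
The three composite rules can be sketched as follows for the Debye integrand. The function names are illustrative, and the ordinates are computed from x³/(e^x − 1) directly (with the limiting value 0 at x = 0) rather than copied from the table.

```python
import math

def trapezoidal(ys, h):
    """Eq. (13-19)."""
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

def simpson(ys, h):
    """Eq. (13-20); the number of intervals must be even."""
    n = len(ys) - 1
    if n % 2:
        raise ValueError("Simpson's rule needs an even number of intervals")
    return h / 3 * (ys[0] + 4 * sum(ys[1:n:2]) + 2 * sum(ys[2:n:2]) + ys[n])

def weddle(ys, h):
    """Eq. (13-21); the number of intervals must be a multiple of six."""
    n = len(ys) - 1
    if n % 6:
        raise ValueError("Weddle's rule needs a multiple of six intervals")
    pattern = [1, 5, 1, 6, 1, 5, 1]
    total = 0.0
    for start in range(0, n, 6):
        # Adjacent blocks share an end point, which therefore gets weight 2.
        total += sum(w * y for w, y in zip(pattern, ys[start:start + 7]))
    return 3 * h / 10 * total

def debye_integrand(x):
    return 0.0 if x == 0 else x ** 3 / math.expm1(x)

h = 0.25
ys = [debye_integrand(i * h) for i in range(7)]
I_T, I_S, I_W = trapezoidal(ys, h), simpson(ys, h), weddle(ys, h)
print(round(I_T, 6), round(I_S, 6), round(I_W, 6))
```

The ordering of the errors is as stated above: the trapezoidal value is off in the third decimal place, Simpson's in the fourth or fifth, and Weddle's agrees with the tabulated 0.615495 to about six places.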


Problem a. Compute some of the coefficients of Table 4. 

Problem b. Divide the interval of the integral of Table 5 into twelve equal parts 
and perform the integrations by the three methods of this section. 

13.12. Gauss' Method. — The method of Gauss not only determines the
(n + 1) values of A_m but also fixes the (n + 1) y_m's of (10) in such a way
that the difference between ∫_a^b φ(x)dx and ∫_a^b f(x)dx is a minimum. Since
there are now (2n + 2) constants available, it follows that if f(x) is a
polynomial of degree ≤ (2n + 1), the method will give an exact result for the
integral. It will be remembered that the Newton-Cotes method will be
exact under similar conditions, if the degree of the polynomial ≤ n, hence
Gauss' method will give a more nearly exact result than the Newton-Cotes
method with the same number of values of y_m, or conversely the
Newton-Cotes method requires a larger number of known values of the function
than Gauss' method for the same allowed error. This is a matter of some
importance especially when the given values of y_m are limited in number
as they are likely to be when they result from experimental measurements.
In applying the method, it is convenient to change the limits of the
integral ∫_a^b f(x)dx by making the substitution

$$x = a + (b - a)v \tag{13-22}$$

hence in terms of the new variable v, the limits are 0 and 1. Then,

$$f(x) = f[a + (b - a)v] = F(v), \qquad dx = (b - a)\,dv \tag{13-23}$$

and

$$I_G = \int_a^b f(x)\,dx = (b - a)\int_0^1 F(v)\,dv \tag{13-24}$$

Developing F(v) in a series similar to (10), we have

$$I_G = (b - a)\left(R_0F_0 + R_1F_1 + R_2F_2 + \cdots + R_nF_n\right) \tag{13-25}$$

where F_m means the numerical value of F(v_m). Now it can be shown^12
that the difference between ∫_a^b f(x)dx and I_G of (25) is made a minimum
provided v_m and R_m are determined by the relations

$$\sum_{m=0}^{n} R_m = 1;\quad \sum_{m=0}^{n} R_m v_m = \frac{1}{2};\quad \sum_{m=0}^{n} R_m v_m^2 = \frac{1}{3};\ \ldots;\quad \sum_{m=0}^{n} R_m v_m^r = \frac{1}{(r + 1)} \tag{13-26}$$

Since there are (2n + 2) constants to be evaluated, the most direct
procedure would be to solve simultaneously (2n + 2) equations like (26).
This, however, is very laborious even for small values of n but the v_m alone
may be found in the following way. Let z_0, z_1, z_2, ..., z_n be the (n + 1)
real roots of the Legendre polynomial^13 P_{n+1} of degree (n + 1) obtained
from the equation P_{n+1}(z) = 0. Then,

$$v_0 = \tfrac{1}{2}(1 + z_0),\quad v_1 = \tfrac{1}{2}(1 + z_1),\ \ldots,\quad v_n = \tfrac{1}{2}(1 + z_n) \tag{13-27}$$

With the (n + 1) values of v_m determined in this way, it is a simple matter
to find the remaining constants R_m, (n + 1) in number, for it is only
necessary to solve simultaneously (n + 1) relations like (26). Values of both
v_m and R_m are given in Table 6.^14

Some writers make the substitution

$$x = \frac{(a + b)}{2} + \frac{(b - a)}{2}\,w$$

which changes the limits of the integral to ±1. In this case,

$$I_G = \frac{(b - a)}{2}\int_{-1}^{1} g(w)\,dw = \frac{b - a}{2}\sum_{m=0}^{n} T_m\,g(w_m)$$

where g(w) corresponds to the former F(v) and

$$T_m = 2R_m; \qquad w_m = 2v_m - 1$$

It will be observed that in Gauss' method, the interval is not subdivided
equally as in the preceding cases but it is divided symmetrically about the
mid-point.

^12 The proof is given by Hobson, E. W., "The Theory of Spherical and Ellipsoidal
Harmonics," Cambridge Press, 1931.

^13 See sec. 3.3.

^14 More extensive lists may be found in Hobson, loc. cit. and in "Project for the
Computation of Numerical Tables," Federal Works Agency, Work Projects
Administration, New York, in preparation.

TABLE 6

n = 2    v_0 = 0.11270166    R_0 = R_2 = 5/18
         v_1 = 0.5           R_1 = 4/9
         v_2 = 0.88729833

    3    v_0 = 0.06943184    R_0 = R_3 = 0.17392742
         v_1 = 0.33000948    R_1 = R_2 = 0.32607258
         v_2 = 0.66999052
         v_3 = 0.93056816

    4    v_0 = 0.04691008    R_0 = R_4 = 0.11846344
         v_1 = 0.23076534    R_1 = R_3 = 0.23931434
         v_2 = 0.5           R_2 = 0.28444444
         v_3 = 0.76923466
         v_4 = 0.95308992

    5    v_0 = 0.03376524    R_0 = R_5 = 0.08566225
         v_1 = 0.16939531    R_1 = R_4 = 0.18038079
         v_2 = 0.38069041    R_2 = R_3 = 0.23395697
         v_3 = 0.61930959
         v_4 = 0.83060469
         v_5 = 0.96623476

Example 9. Apply Gauss' method to the integral of Examples 6
and 7, subdividing the interval into four parts. From (22) and (44), we
find $x = 1 + v$ and $I_G = \int_0^1 F(v)\,dv$. From Table 6, with n = 3, we obtain
$F_0 = 1/1.069432 = 0.935076$; $F_1 = 1/1.330009 = 0.751875$;
$F_2 = 1/1.669990 = 0.598806$; $F_3 = 1/1.930568 = 0.517982$. Then,

$$I_G = 0.173927 \times (0.935076 + 0.517982) + 0.326072 \times (0.751875 + 0.598806) = 0.693145$$

The result is as precise as that obtained by the Euler-Maclaurin or Gregory
formulas but entails much less work.
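
The arithmetic of Example 9 can be checked with a short Python sketch. It assumes that the quadrature has the form $(b - a)\sum R_m F(v_m)$ with $x = a + (b - a)v$, which is how the example uses Table 6; the function name `gauss_quadrature` is illustrative only.

```python
import math

# Abscissas v_m and weights R_m for n = 3, copied from Table 6.
V = [0.06943184, 0.33000948, 0.66999052, 0.93056816]
R = [0.17392742, 0.32607258, 0.32607258, 0.17392742]

def gauss_quadrature(f, a, b, v=V, r=R):
    """Approximate the integral of f over [a, b] as (b - a) * sum(R_m * f(x_m)),
    with x_m = a + (b - a) * v_m (assumed form of the rule)."""
    return (b - a) * sum(rm * f(a + (b - a) * vm) for vm, rm in zip(v, r))

# Integral of 1/x from 1 to 2, whose exact value is ln 2.
integral = gauss_quadrature(lambda x: 1.0 / x, 1.0, 2.0)
print(f"Gauss: {integral:.6f}   ln 2: {math.log(2):.6f}")
```

Only four evaluations of the integrand are needed, which is the economy the example points out.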


Problem a. Find values of $v_m$ and $R_m$ for n = 2. Hint: $P_{n+1}(z) = \tfrac{1}{2}(5z^3 - 3z) = 0$.

Problem b. Evaluate the integral of Example 6, sec. 13.9 by Gauss’ method. 
Use the limits 1.0 and 3.0, subdividing this interval into four divisions. 


13.13. Remarks Concerning Quadrature Formulas. — The selection of 
the most suitable quadrature formula to use in a specific case is a matter for 
which no general rules can be given. When the explicit form of f(x) is
known and the differentiations easily made, the Euler-Maclaurin formula
has the advantage of giving a result to any required number of figures.
When the explicit form of f(x) is not known or if it cannot be differentiated
easily, Gregory’s formula is useful. As previously stated, the Newton- 
Cotes formula and its special cases such as the trapezoidal rule, Simpson’s 
and Weddle’s rules are approximations to the Euler-Maclaurin and Gregory 
formulas; they have the advantage of requiring less labor to apply than 
the two former but result in a loss of accuracy. Gauss’ method is appar- 
ently not used as often as might be expected in chemical and physical 
calculations. Since calculating machines are commonly used in such work, 
the application of it is not laborious and the resulting precision should 
recommend it. 

The reader should remember that in approximate quadratures, the 
integrand is being replaced by a polynomial, the latter instead of the origi- 
nal function then being integrated. It thus follows that the reliability of 
the result is determined by the fidelity with which the approximating poly- 
nomial matches the given function. Since Gauss’ formula fits a poly- 
nomial of given degree with fewer known points than any of the other 
formulas, it should be preferred when the function is of such a form that it 
can be used. Even if the explicit form of f(x) is unknown, Gauss' formula
may still be applied but it requires interpolation between the given
$y_m$ to find the proper F(v). When the $y_m$ are the results of experiment and
can be arranged at will, Gauss' formula in fact prescribes their optimum
positions as those determined by the $v_m$.

One caution regarding quadrature formulas should be mentioned. If
the graph of f(x) is such that the area under one portion of the integral is
much larger than that under another portion, the integral should be
evaluated separately for each area. The value of h for the sub-interval
contributing the least amount to the final result may then be taken as a
larger quantity than the h-value for the remaining sub-intervals. If
nothing is known of the behavior of f(x), a graph should always be drawn.


NUMERICAL SOLUTION OF DIFFERENTIAL EQUATIONS 

13.14. Introduction. — One often encounters differential equations which 
cannot be solved by any of the methods described in Chapter 2, except 
that of solution in terms of a series, and this method may be difficult to 
apply in certain cases. Even when an analytical solution is available, it is 
sometimes not easy to find numerical values of corresponding pairs of the 
dependent and independent variables. For example, if the initial conditions
$x_0 = 0$, $y_0 = 1$ are given for $dy/dx = (y - x)/(y + x)$, the solution
is $\tfrac{1}{2}\ln(x^2 + y^2) + \tan^{-1}(y/x) = \pi/2$ but the labor of finding values
of x for given values of y will be very great. In cases of this kind, it is
possible to proceed by graphical or numerical methods. The object of
the latter is to obtain a table of x and y over the range of x required by 
the particular problem at hand. When a few such values are known, the 
table may be extended rather easily, as will be shown. Special methods 
are required for finding the first few values of x. We present four different 
ways of starting the solution of a differential equation by numerical meth- 
ods, and then show how the solution may be continued by extrapolation. 

13.15. The Taylor Series Method. — Suppose a differential equation
of the first order is given:

$$\frac{dy}{dx} = f(x, y)$$    (13-28)

with initial values $x = x_0$, $y = y_0$. We may then write the Taylor series

$$y = y_0 + (x - x_0)\,y_0' + \frac{(x - x_0)^2}{2!}\,y_0'' + \cdots + \frac{(x - x_0)^n}{n!}\,y_0^{(n)}$$    (13-29)

If it is possible to find the various derivatives, the calculations may be
extended to as many values of x as desired.

Example 10. Start the solution of the differential equation

$$\frac{dy}{dx} = e^{-x} - y$$    (13-30)

with initial conditions, $x_0 = 0$, $y_0 = 0$. The exact solution of (30) is
found by the methods of Chapter 2 to be $y = xe^{-x}$; the reader will
recognize that it is of the form of the differential equation occurring in the study
of radioactive disintegration and in the kinetics of chemical reactions
involving consecutive first order decompositions. Since $y' = e^{-x} - y$,
$y'' = -e^{-x} - y'$, ⋯, it follows that $y_0^{(n)} = (-1)^{n-1}\,n$ and from (29),

$$y = x - x^2 + \frac{x^3}{2!} - \frac{x^4}{3!} + \frac{x^5}{4!} - \cdots$$

For graphical methods, see Levy, H., and Baggott, E. A., "Numerical Studies
in Differential Equations," Vol. 1, Watts and Co., London, 1934, or Sherwood and
Reed, "Applied Mathematics in Chemical Engineering," McGraw-Hill Book Co.,
New York, 1939.

Taking x as 0.1, 0.2 and 0.3, we find the results which appear in Table 7.

TABLE 7

  x       y         Exact values of y

  0       0         0
  0.1     0.0905    0.09048
  0.2     0.1637    0.16375
  0.3     0.2222    0.22224


While the method is very simple, it is often tedious to apply as the
successive derivatives may become difficult to handle and even at x = 0.3
in this case, the fifth derivative is needed. However, it would appear that
this procedure is preferable to any other in finding the first few values of y 
when it is possible to use it. 
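
As a rough check on Table 7, the partial sums of the series for the equation of Example 10, $y = x - x^2 + x^3/2! - x^4/3! + \cdots$, can be compared with the exact solution $y = xe^{-x}$. The sketch below is illustrative; the function name `taylor_y` and the choice of six terms are assumptions, not part of the text.

```python
import math

def taylor_y(x, terms=6):
    """Partial sum of y = x - x^2 + x^3/2! - x^4/3! + ... (Example 10)."""
    return sum((-1) ** k * x ** (k + 1) / math.factorial(k) for k in range(terms))

for x in (0.1, 0.2, 0.3):
    exact = x * math.exp(-x)
    print(f"x = {x:.1f}   series = {taylor_y(x):.5f}   exact = {exact:.5f}")
```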

13.16. The Method of Picard (Successive Approximations or Iteration). —
From (28), we see that a solution may be found in the form of an
integral equation

$$y = y_0 + \int_{x_0}^{x} f(x, y)\,dx = y_0 + \int_{x_0}^{x} \frac{dy}{dx}\,dx$$    (13-31)

An approximate solution of this equation may be made by assuming that
$y = y_0$ under the integral sign. The integral may then be evaluated (by
quadratures, if necessary) since it is only a function of x and the constant
$y_0$. Denoting this first approximation to y by ${}^{1}y$,

$${}^{1}y = y_0 + \int_{x_0}^{x} f(x, y_0)\,dx$$    (13-32)

The process may be repeated to give

$${}^{2}y = y_0 + \int_{x_0}^{x} f(x, {}^{1}y)\,dx$$    (13-33)

and so on.

Example 11. Start the solution of Example 10, sec. 13.15 by this
method.

$${}^{1}y_1 = y_0 + \int_0^x (e^{-x} - y_0)\,dx = \int_0^x e^{-x}\,dx = 1 - e^{-x}$$

$${}^{2}y_1 = y_0 + \int_0^x (e^{-x} - {}^{1}y_1)\,dx = \int_0^x (2e^{-x} - 1)\,dx = 2(1 - e^{-x}) - x$$

$${}^{3}y_1 = 3(1 - e^{-x}) - 2x + \tfrac{1}{2}x^2$$

With x = 0.1, ${}^{3}y_1 = 0.0906$; the next approximation, ${}^{4}y_1$, is the same as
${}^{3}y_1$, hence we proceed to calculate $y_2$ at x = 0.2 from the relations

$${}^{1}y_2 = {}^{3}y_1 + \int_{0.1}^{x} (e^{-x} - {}^{3}y_1)\,dx = {}^{3}y_1 + e^{-0.1} + 0.1\,{}^{3}y_1 - e^{-x} - {}^{3}y_1\,x = 1.0045 - e^{-x} - 0.0906x$$

$${}^{2}y_2 = {}^{3}y_1 + \int_{0.1}^{x} (e^{-x} - {}^{1}y_2)\,dx$$

The next value, ${}^{3}y_2$, is the same as ${}^{2}y_2$ so we go on in the same way to find
${}^{1}y_3$, etc., at x = 0.3. The results by the Picard method are seen from
Table 7 to be not quite as good as those obtained in Example 10. Moreover, 
the disadvantages here are similar to those of the method of sec. 13.15, for 
the successive integrals may become more and more difficult to determine. 
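
When the successive integrals cannot be written in closed form, the same iteration may be carried out numerically. The sketch below is one possible arrangement, not the procedure of the text: it tabulates y on a grid and evaluates each integral by the trapezoidal rule. The function name `picard`, the grid size and the number of sweeps are all assumptions.

```python
import math

def picard(f, x0, y0, x_end, sweeps=4, n=100):
    """Iterate y <- y0 + integral of f(x, y) dx on a uniform grid,
    evaluating the integral by the trapezoidal rule."""
    h = (x_end - x0) / n
    xs = [x0 + i * h for i in range(n + 1)]
    ys = [y0] * (n + 1)                 # starting assumption: y = y0 everywhere
    for _ in range(sweeps):
        fs = [f(x, y) for x, y in zip(xs, ys)]
        new = [y0]
        for i in range(n):
            new.append(new[-1] + 0.5 * h * (fs[i] + fs[i + 1]))
        ys = new
    return xs, ys

# Equation (30): dy/dx = exp(-x) - y with x0 = 0, y0 = 0, carried to x = 0.1.
xs, ys = picard(lambda x, y: math.exp(-x) - y, 0.0, 0.0, 0.1)
print(f"y(0.1) is approximately {ys[-1]:.5f}")
```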

13.17. The Modified Euler Method. — If the intervals between successive
values of x are small enough we may write $\Delta x = h$ and

$$\Delta y = \left(\frac{dy}{dx}\right)_0 \Delta x$$    (13-34)

An approximate value of $y_1$ at $x_1 = x_0 + h$ is then given by

$${}^{1}y_1 = y_0 + \Delta y = y_0 + h\left(\frac{dy}{dx}\right)_0$$    (13-35)

An approximation to dy/dx at $x_1$ may be obtained by the relation

$${}^{1}\!\left(\frac{dy}{dx}\right)_1 = f(x_1, {}^{1}y_1)$$

which leads to an improved value of $y_1$

$${}^{2}y_1 = y_0 + \frac{h}{2}\left[\left(\frac{dy}{dx}\right)_0 + {}^{1}\!\left(\frac{dy}{dx}\right)_1\right]$$

This in turn will give a better approximation to dy/dx

$${}^{2}\!\left(\frac{dy}{dx}\right)_1 = f(x_1, {}^{2}y_1)$$

which may be used to compute the third approximation to $y_1$. The process
is repeated until there is no further change in the results. The values of y
and dy/dx at $x_2$ are found in a similar way.

Example 12. Start the solution of (30) by this method. Since
$x_0 = y_0 = 0$, $(dy/dx)_0 = 1$. With h = 0.1,

  ${}^{1}y_1 = 1 \times 0.1 = 0.1$
  ${}^{1}(dy/dx)_1 = e^{-0.1} - 0.1 = 0.8048$
  ${}^{2}y_1 = 0.1(1.0 + 0.8048)/2 = 0.0902$
  ${}^{2}(dy/dx)_1 = 0.9048 - 0.0902 = 0.8146$
  ${}^{3}y_1 = 0.1(1.0 + 0.8146)/2 = 0.0907$
  ${}^{3}(dy/dx)_1 = 0.9048 - 0.0907 = 0.8141$
  ${}^{4}y_1 = 0.05(1.0 + 0.8141) = 0.0907$

No further improvement results by continuing the approximations, so we
proceed to x = 0.2 with $y_1 = 0.0907$, $(dy/dx)_1 = 0.8141$. Then,

  ${}^{1}y_2 = 0.0907 + 0.8141 \times 0.1 = 0.1721$
  ${}^{1}(dy/dx)_2 = e^{-0.2} - 0.1721 = 0.6466$

and finally,

  $y_2 = 0.1641$, $(dy/dx)_2 = 0.6546$

This method is tedious in application but perhaps less complicated than 
either of the preceding methods since neither differentiation nor integra- 
tion is required. 
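
The repeated correction of Example 12 lends itself to a short loop. The following Python sketch assumes the scheme described in this section (an Euler first approximation followed by averaging of the slopes at the two ends of the interval); the function name `modified_euler_step` and the stopping tolerance are illustrative.

```python
import math

def modified_euler_step(f, x0, y0, h, tol=1e-6, max_iter=50):
    """Advance y from x0 to x0 + h: Euler first approximation, then repeated
    correction with the mean of the slopes at both ends of the interval."""
    d0 = f(x0, y0)
    y1 = y0 + h * d0                        # first approximation
    for _ in range(max_iter):
        d1 = f(x0 + h, y1)                  # slope at x1 from the current y1
        y_next = y0 + 0.5 * h * (d0 + d1)   # improved value of y1
        if abs(y_next - y1) < tol:
            return y_next
        y1 = y_next
    return y1

f = lambda x, y: math.exp(-x) - y           # equation (30)
y1 = modified_euler_step(f, 0.0, 0.0, 0.1)
y2 = modified_euler_step(f, 0.1, y1, 0.1)
print(f"y1 = {y1:.4f}, y2 = {y2:.4f}")
```

Carried to convergence, the loop gives values that agree with the 0.0907 and 0.1641 of the example to four decimal places.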

13.18. The Runge-Kutta Method. — In this method it is necessary to
calculate the four quantities

$$k_1 = f(x_0, y_0)\,h$$
$$k_2 = f\!\left(x_0 + \frac{h}{2},\; y_0 + \frac{k_1}{2}\right)h$$
$$k_3 = f\!\left(x_0 + \frac{h}{2},\; y_0 + \frac{k_2}{2}\right)h$$    (13-39)
$$k_4 = f(x_0 + h,\; y_0 + k_3)\,h$$

Then,

$$x_1 = x_0 + h, \qquad y_1 = y_0 + \Delta y$$
$$\Delta y = \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4)$$    (13-40)

It will be noted that if f(x,y) is independent of y, (40) reduces to Simpson's
rule. The same formulas are used to compute y at $x_2$, substituting $x_1$ and
$y_1$ for $x_0$ and $y_0$ in (39).

Example 13. With the same differential equation as before (Example
10, sec. 13.15), we find

  $k_1 = (e^{-x_0} - y_0)h = 0.1$
  $k_2 = (e^{-0.05} - 0.05)0.1 = 0.0901$
  $k_3 = (e^{-0.05} - 0.0450)0.1 = 0.0906$
  $k_4 = (e^{-0.1} - 0.0906)0.1 = 0.0814$

Hence,

$$y_1 = y_0 + \Delta y = \tfrac{1}{6}(0.1 + 0.1802 + 0.1812 + 0.0814) = 0.0905$$

For the next interval, we find in a similar way, $k_1 = 0.0814$, $k_2 = 0.0730$,
$k_3 = 0.0734$, $k_4 = 0.0655$, $\Delta y = 0.0733$, $y_2 = 0.0905 + 0.0733 = 0.1638$.

The error in the Runge-Kutta method is of the order of $h^5$. It will be
seen that its use is reasonably simple; it is probably the most generally
useful of the four methods given here.
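
Equations (39) and (40) translate almost line for line into code. The sketch below applies them to the equation of Example 13; the function name `rk4_step` is an illustrative choice.

```python
import math

def rk4_step(f, x0, y0, h):
    """One Runge-Kutta step following (13-39) and (13-40)."""
    k1 = f(x0, y0) * h
    k2 = f(x0 + h / 2, y0 + k1 / 2) * h
    k3 = f(x0 + h / 2, y0 + k2 / 2) * h
    k4 = f(x0 + h, y0 + k3) * h
    return y0 + (k1 + 2 * k2 + 2 * k3 + k4) / 6

f = lambda x, y: math.exp(-x) - y    # equation (30)
y1 = rk4_step(f, 0.0, 0.0, 0.1)
y2 = rk4_step(f, 0.1, y1, 0.1)
print(f"y1 = {y1:.5f}, y2 = {y2:.5f}")
```

Working in full floating-point precision removes the rounding that accumulates in a four-figure hand computation, so the second value lies slightly closer to the exact 0.16375 than the 0.1638 found above.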

13.19. Continuing the Solution. — When the first few values (three or
four) of y have been found by one of the preceding methods, the solution
may be continued by extrapolation. For this purpose, it is appropriate
to use Newton's interpolation formula (4), rewriting it in terms of
$y' = dy/dx$, $y'_k$ and the differences $\Delta y'_{k-1}$, $\Delta^2 y'_{k-2}$, ⋯. Upon
substituting this expression in the equation

$$\Delta y = \int y'\,dx$$    (13-41)

and performing the integration, several useful formulas may be obtained
by changing the limits of the definite integral.

$$(\Delta y)_k^{k+1} = h\left(y'_k + \tfrac{1}{2}\Delta y'_{k-1} + \tfrac{5}{12}\Delta^2 y'_{k-2} + \tfrac{3}{8}\Delta^3 y'_{k-3} + \tfrac{251}{720}\Delta^4 y'_{k-4}\right)$$    (13-42)

$$(\Delta y)_{k-1}^{k} = h\left(y'_k - \tfrac{1}{2}\Delta y'_{k-1} - \tfrac{1}{12}\Delta^2 y'_{k-2} - \tfrac{1}{24}\Delta^3 y'_{k-3} - \tfrac{19}{720}\Delta^4 y'_{k-4}\right)$$    (13-43)

$$(\Delta y)_{k-2}^{k-1} = h\left(y'_k - \tfrac{3}{2}\Delta y'_{k-1} + \tfrac{5}{12}\Delta^2 y'_{k-2} + \tfrac{1}{24}\Delta^3 y'_{k-3} + \tfrac{11}{720}\Delta^4 y'_{k-4}\right)$$    (13-44)

$$(\Delta y)_{k-3}^{k-2} = h\left(y'_k - \tfrac{5}{2}\Delta y'_{k-1} + \tfrac{23}{12}\Delta^2 y'_{k-2} - \tfrac{3}{8}\Delta^3 y'_{k-3} - \tfrac{19}{720}\Delta^4 y'_{k-4}\right)$$    (13-45)

$$(\Delta y)_{k-4}^{k-3} = h\left(y'_k - \tfrac{7}{2}\Delta y'_{k-1} + \tfrac{53}{12}\Delta^2 y'_{k-2} - \tfrac{55}{24}\Delta^3 y'_{k-3} + \tfrac{251}{720}\Delta^4 y'_{k-4}\right)$$    (13-46)

The meaning of a symbol such as $(\Delta y)_k^{k+1}$ should be clear. It is the
increment to be added to the k-th value of y in the difference table to obtain the
next value beyond, that is, the value of y at $x_{k+1}$. Equation (42) is thus
to be used for extending the table to larger values of x while the remaining
formulas are useful in checking the values of y already found.

Example 14. Extend the integration of the differential equation of
Examples 10, 11, 12 and 13, using the values of y' found in Example 12.
We first collect the data as shown in Table 8. To check y at x = 0.1,
let us use (44). Then since k = 2,

$$(\Delta y)_0^1 = 0.1\left(0.6546 + \tfrac{3}{2}\,0.1595 + \tfrac{5}{12}\,0.0264\right) = 0.0905$$

Thus, $y_1 = y_0 + (\Delta y)_0^1 = 0.0905$, which shows that the result in Table 8
is in error by 2 units in the last place. Similarly, to check $y_2$, we use (43)
to obtain

$$(\Delta y)_1^2 = 0.1(0.6546 + 0.0798 - 0.0022) = 0.0732$$

and

$$y_2 = y_1 + (\Delta y)_1^2 = 0.0905 + 0.0732 = 0.1637$$

We now make a new table (Table 9) to include our corrected values of
y, y', Δy', etc. To find $y_3$, we use (42) to obtain

$$(\Delta y)_2^3 = 0.1(0.6550 - 0.0796 + 0.0110) = 0.0586$$
$$y_3 = 0.1637 + 0.0586 = 0.2223$$

A check on $y_3$ may be found from (43)

$$(\Delta y)_2^3 = 0.1(0.5185 + 0.0682 - 0.0011 + 0.0005) = 0.0586$$


TABLE 8

  x       y         y'        Δy'        Δ²y'

  0       0         1.0000
  0.1     0.0907    0.8141    -0.1859
  0.2     0.1641    0.6546    -0.1595    +0.0264


TABLE 9

  x       y         Δy        y'        Δy'        Δ²y'       Δ³y'

  0.00    0                   1.0000
  0.10    0.0905    0.0905    0.8143    -0.1857
  0.20    0.1637    0.0732    0.6550    -0.1593    +0.0264
  0.30    0.2223    0.0586    0.5185    -0.1365    +0.0128    -0.0136

Since this is the same result as that found previously, we proceed to the
next value of x. Moreover, since the preceding y was correct at the first
trial, we suspect that the value of h might be increased, say to 0.20. We
thus obtain y for x = 0.40 in the same manner as before, then rewrite the
table for x = 0, 0.20 and 0.40. From the new table, we go on to x = 0.60,
etc.
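
The bookkeeping of the difference table is easily mechanized. The sketch below forms the backward differences of the tabulated y' and applies the extrapolation formula with coefficients 1, 1/2, 5/12, 3/8, 251/720 and the checking formula with coefficients 1, -1/2, -1/12, -1/24, -19/720, which are the values obtained by integrating Newton's backward-difference formula over one interval ahead of and one interval behind the last tabulated point. The function names and the use of the first three y' entries of Table 9 are illustrative assumptions.

```python
def backward_differences(values):
    """Return [y'_k, dy'_(k-1), d2y'_(k-2), ...] from tabulated y' values."""
    diffs, row = [values[-1]], list(values)
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        diffs.append(row[-1])
    return diffs

EXTRAPOLATE = [1, 1 / 2, 5 / 12, 3 / 8, 251 / 720]     # step beyond x_k
CHECK = [1, -1 / 2, -1 / 12, -1 / 24, -19 / 720]       # step ending at x_k

def increment(values, h, coeffs):
    """h times the weighted sum of the available backward differences."""
    return h * sum(c * d for c, d in zip(coeffs, backward_differences(values)))

y_prime = [1.0000, 0.8143, 0.6550]          # y' at x = 0, 0.1, 0.2
dy_next = increment(y_prime, 0.1, EXTRAPOLATE)   # increment from 0.2 to 0.3
dy_last = increment(y_prime, 0.1, CHECK)         # increment from 0.1 to 0.2
print(f"increment to x = 0.3: {dy_next:.4f}; check of last step: {dy_last:.4f}")
```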
13.20. Milne's Method. — One further method of continuing the solution
of a differential equation is often useful. Supposing the first four
values of y and y' have been found by some of the previous methods, we
continue as follows.

1. Find a first approximation to the next y by using the formula

$${}^{1}y_k = y_{k-4} + \frac{4h}{3}\left(2y'_{k-1} - y'_{k-2} + 2y'_{k-3}\right)$$    (13-47)

2. Substitute this in the original differential equation (28) to find ${}^{1}y'_k$.

3. Use the value of ${}^{1}y'_k$ to calculate ${}^{2}y_k$ from the relation

$${}^{2}y_k = y_{k-2} + \frac{h}{3}\left({}^{1}y'_k + 4y'_{k-1} + y'_{k-2}\right)$$    (13-48)

If ${}^{1}y_k$ and ${}^{2}y_k$ agree to the desired number of figures, we may proceed to
the next interval in the same way. If they do not agree, the size of the
interval must be decreased. The error due to the use of (48) is

$$\tfrac{1}{29}\left|\,{}^{2}y_k - {}^{1}y_k\,\right|$$

Eqs. (47) and (48) are obtained by integrating Newton's interpolation
formula (3), after expressing it in terms of y'. Both formulas are exact
when fourth differences of y' vanish.

Example 15. Use Milne's method to continue the solution of the
differential equation of the previous examples. For x = 0.4, we find using
Table 9 and (47),

$${}^{1}y_4 = \frac{0.4}{3}\,(2 \times 0.5185 - 0.6550 + 2 \times 0.8143) = 0.2681$$

From the original differential equation (30)

$${}^{1}y'_4 = (0.6703 - 0.2681) = 0.4022$$

From (48),

$${}^{2}y_4 = 0.1637 + \frac{0.1}{3}\,(0.4022 + 4 \times 0.5185 + 0.6550) = 0.2681$$
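
The three steps of Milne's method can be collected in one function. The sketch below uses the first approximation $y_{k-4} + \tfrac{4h}{3}(2y'_{k-1} - y'_{k-2} + 2y'_{k-3})$ and the second approximation $y_{k-2} + \tfrac{h}{3}(y'_k + 4y'_{k-1} + y'_{k-2})$, with the starting values of Table 9; the function name `milne_step` and its argument layout are assumptions made for illustration.

```python
import math

def milne_step(f, x_next, h, y, yp):
    """y and yp hold the last four values of y and y' (oldest first).
    Returns the first and second approximations to the next y."""
    y_first = y[-4] + 4 * h / 3 * (2 * yp[-1] - yp[-2] + 2 * yp[-3])
    yp_first = f(x_next, y_first)            # y' from the differential equation
    y_second = y[-2] + h / 3 * (yp_first + 4 * yp[-1] + yp[-2])
    return y_first, y_second

f = lambda x, y: math.exp(-x) - y            # equation (30)
y = [0.0, 0.0905, 0.1637, 0.2223]            # y at x = 0, 0.1, 0.2, 0.3
yp = [1.0000, 0.8143, 0.6550, 0.5185]        # y' at the same points
y_first, y_second = milne_step(f, 0.4, 0.1, y, yp)
print(f"first approximation {y_first:.4f}, second approximation {y_second:.4f}")
```

Agreement of the two values to four places is the signal, described above, that the interval h = 0.1 need not be reduced.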


Problem. Use the various methods of this chapter to obtain the solutions,
correct to four decimal places, of the differential equation dy/dx = (x − y) between 0
and 0.25, with $x_0 = 0$, $y_0 = 1$. The exact solution is $y = (x - 1) + 2e^{-x}$.


13.21. Simultaneous Differential Equations of the First Order. — Suppose
the given equations are

$$\frac{dy}{dx} = f_1(x, y, z); \qquad \frac{dz}{dx} = f_2(x, y, z)$$    (13-49)

where x is the independent variable and y, z are dependent variables.
Provided initial values of x, y and z are given, the first increments in y and z
due to an increment Δx in x may be found by any of the methods given in
the preceding sections. The procedure should be obvious, but it is particularly
necessary to check the results carefully at each stage of the solution.
If the Runge-Kutta method is used, the following equations replace (39)
and (40)

$$k_1 = f_1(x_0, y_0, z_0)\,h$$
$$k_2 = f_1\!\left(x_0 + \frac{h}{2},\; y_0 + \frac{k_1}{2},\; z_0 + \frac{m_1}{2}\right)h$$
$$k_3 = f_1\!\left(x_0 + \frac{h}{2},\; y_0 + \frac{k_2}{2},\; z_0 + \frac{m_2}{2}\right)h$$
$$k_4 = f_1(x_0 + h,\; y_0 + k_3,\; z_0 + m_3)\,h$$
$$m_1 = f_2(x_0, y_0, z_0)\,h$$
$$m_2 = f_2\!\left(x_0 + \frac{h}{2},\; y_0 + \frac{k_1}{2},\; z_0 + \frac{m_1}{2}\right)h$$    (13-50)
$$m_3 = f_2\!\left(x_0 + \frac{h}{2},\; y_0 + \frac{k_2}{2},\; z_0 + \frac{m_2}{2}\right)h$$
$$m_4 = f_2(x_0 + h,\; y_0 + k_3,\; z_0 + m_3)\,h$$

$$x_1 = x_0 + h; \qquad y_1 = y_0 + \Delta y; \qquad z_1 = z_0 + \Delta z$$
$$\Delta y = \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4)$$
$$\Delta z = \tfrac{1}{6}(m_1 + 2m_2 + 2m_3 + m_4)$$    (13-51)

13.22. Differential Equations of Second or Higher Order. — Any differential
equation of second or higher order is reduced to a system of simultaneous
equations by the introduction of new variables. Consider the
equation

$$\frac{d^n y}{dx^n} = f\!\left(x, y, y', y'', \dots, y^{(n-1)}\right)$$    (13-52)

where y' = dy/dx, y'' = d²y/dx², etc. Make the substitutions

$$z_1 = \frac{dy}{dx}; \qquad z_2 = \frac{dz_1}{dx}; \qquad \cdots; \qquad z_{n-1} = \frac{dz_{n-2}}{dx}$$    (13-53)

then,

$$\frac{d^n y}{dx^n} = \frac{dz_{n-1}}{dx} = f(x, y, z_1, z_2, \dots, z_{n-1})$$    (13-54)

Provided initial values of $x, y, z_1, z_2, \dots, z_{n-1}$ are given, the problem is
equivalent to the solution of a system of simultaneous first order differential
equations which may be effected as described in sec. 13.21.

In physical problems, differential equations of the type

$$\frac{d^2 y}{dx^2} = f(x, y)$$

often arise, with the requirement that the variables satisfy certain boundary
conditions, say $x = x_0$, $y = y_0$ and $x = x_1$, $y = y_1$ with the initial
value of dy/dx unknown. For example, in the Thomas-Fermi theory of
the atom, the equation $d^2y/dx^2 = (y^3/x)^{1/2}$ occurs with the boundary
conditions, x = 0, y = 1; x = ∞, y = 0. In cases of this kind, a tentative
value of dy/dx is assumed and a rough integration is made over the range
of x. This first approximation will usually suggest a better guess for
the initial value of dy/dx. After several attempts are made, the value of
dy/dx may usually be found to the desired accuracy.

Example 16. Find y and dy/dx for the equation

$$\frac{d^2 y}{dx^2} + 4x\,\frac{dy}{dx} - 4y = 0$$

Let dy/dx = z, then the second order equation is equivalent to the first
order equations

$$\frac{dy}{dx} = z; \qquad \frac{dz}{dx} + 4xz - 4y = 0$$

which may be solved by the previous methods. If the Runge-Kutta
method is used, $f_1(x,y,z) = z$ and $f_2(x,y,z) = -4xz + 4y$. In this case,
$f_1$ does not depend on x and y, a situation which makes the evaluation of
the $k_i$ in (50) somewhat simpler than in the general case. The differential
equation of this problem may be solved exactly by the substitution
$y = ve^{-x^2}$.
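
The Runge-Kutta scheme for a pair of first order equations may be sketched as follows for the system of Example 16, $dy/dx = z$, $dz/dx = -4xz + 4y$. The example states no starting values, so the initial conditions y(0) = 0, y'(0) = 1 used here are purely hypothetical; they are convenient because y = x then satisfies the equation exactly and provides a check. The function name `rk4_system_step` is also an assumption.

```python
def rk4_system_step(f1, f2, x0, y0, z0, h):
    """One Runge-Kutta step for dy/dx = f1(x, y, z), dz/dx = f2(x, y, z)."""
    k1 = f1(x0, y0, z0) * h
    m1 = f2(x0, y0, z0) * h
    k2 = f1(x0 + h / 2, y0 + k1 / 2, z0 + m1 / 2) * h
    m2 = f2(x0 + h / 2, y0 + k1 / 2, z0 + m1 / 2) * h
    k3 = f1(x0 + h / 2, y0 + k2 / 2, z0 + m2 / 2) * h
    m3 = f2(x0 + h / 2, y0 + k2 / 2, z0 + m2 / 2) * h
    k4 = f1(x0 + h, y0 + k3, z0 + m3) * h
    m4 = f2(x0 + h, y0 + k3, z0 + m3) * h
    dy = (k1 + 2 * k2 + 2 * k3 + k4) / 6
    dz = (m1 + 2 * m2 + 2 * m3 + m4) / 6
    return y0 + dy, z0 + dz

f1 = lambda x, y, z: z
f2 = lambda x, y, z: -4 * x * z + 4 * y

# Hypothetical starting values (none are given in the text).
x, y, z = 0.0, 0.0, 1.0
h = 0.1
for _ in range(10):
    y, z = rk4_system_step(f1, f2, x, y, z, h)
    x += h
print(f"x = {x:.1f}, y = {y:.6f}, dy/dx = {z:.6f}")
```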

PART 2. ALGEBRAIC CALCULATIONS 


13.23. Numerical Solution of Transcendental Equations. — No general 
method exists for finding the roots of transcendental equations such as 
$xe^{x} = 1$ or $x^{2} = \sin x$. Approximate values may always be found by
graphical means; where more precise results are required several analytical 
procedures are available. 

For more details of these cases, see Levy and Baggott, loc. cit. The Thomas-
Fermi differential equation is discussed by Baker, Phys. Rev., 36, 632 (1930).

a. The Method of "Regula Falsi." Suppose the given equation is
f(x) = 0, then it is obvious that the plot of y = f(x) will give the required
root when y = 0, that is when the graph crosses the x-axis. Two values
of x, say $x_0$ and $x_1$ with the corresponding values of y are selected from
a graph or otherwise. Then if $x_0$ is near the root desired, a better
approximation for the root is given by

$$x = x_0 + \Delta x$$

where

$$\Delta x = \frac{(x_1 - x_0)\,|y_0|}{|y_0| + |y_1|}$$    (13-55)

The process is continued until the required number of figures is obtained.

Example 17. Find the solution of $f(x) = (5 - x)e^{x} - 5 = 0$ near
x = 5. One solution is clearly x = 0; to find the other let $x_0 = 4.5$,
$x_1 = 5.0$; $y_0 = 40.00$, $y_1 = -5.00$, hence

$${}^{1}\Delta x = \frac{0.5 \times 40.00}{45.00} = 0.44; \qquad {}^{1}x = 4.50 + 0.44 = 4.94$$

A second approximation with $x_0 = 4.94$; $y_0 = 3.382$ gives

$${}^{2}\Delta x = \frac{0.06 \times 3.382}{8.382} = 0.024; \qquad {}^{2}x = 4.94 + 0.024 = 4.964$$

The third approximation with $x_0 = 4.964$; $y_0 = 0.1516$ gives

$${}^{3}\Delta x = \frac{0.036 \times 0.1516}{5.1516} = 0.001; \qquad {}^{3}x = 4.964 + 0.001 = 4.965$$

Further repetition of the calculations shows that this result is correct to four
significant figures. The value 4.965114 has been obtained by Birge.
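
The successive steps of Example 17 follow a fixed pattern, sketched below. The code assumes that the two starting values bracket the root (the values of y have opposite signs) and keeps the bracket at each step; the function name `regula_falsi` and the stopping tolerance are illustrative.

```python
import math

def regula_falsi(f, x0, x1, tol=1e-7, max_iter=100):
    """Repeated application of dx = (x1 - x0)|y0| / (|y0| + |y1|).
    x0 and x1 are assumed to bracket the root."""
    y0, y1 = f(x0), f(x1)
    x = x0
    for _ in range(max_iter):
        x = x0 + (x1 - x0) * abs(y0) / (abs(y0) + abs(y1))
        y = f(x)
        if abs(y) < tol:
            break
        if y * y0 > 0:
            x0, y0 = x, y      # root lies between x and x1
        else:
            x1, y1 = x, y      # root lies between x0 and x
    return x

f = lambda x: (5 - x) * math.exp(x) - 5
root = regula_falsi(f, 4.5, 5.0)
print(f"root = {root:.6f}")
```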


b. The Newton-Raphson Method. When the derivative of f(x) is easily
evaluated numerically, the real roots of f(x) = 0 may be determined in the
following way. Suppose $x_0$ is an approximate value of one of the roots,
then an improved value of the root is given by

$$x = x_0 + \Delta x; \qquad \Delta x = -\frac{f(x_0)}{f'(x_0)}$$    (13-56)

The next approximation is found by substituting x in place of $x_0$ to get a
new value of Δx, continuing in this way as long as necessary. In practice,
it will be found that after a few approximations, the value of the derivative
will change very little with succeeding values of x hence f' need not be
recomputed.

This equation occurs in the evaluation of Wien's displacement law from the
Planck radiation formula, see, for example, Richtmyer, F. K., "Introduction to Modern
Physics," First Edition, McGraw-Hill Book Co., New York, 1928, p. 242.

Birge, R. T., Rev. Mod. Phys., 13, 233 (1941).

Example 18. Find x of Example 17, starting with $x_0 = 4.9$. Substitution
gives $f(x_0) = 8.43$; $f'(x_0) = -120.87$; Δx = 8.43/120.87 =
0.07; ${}^{1}x = 4.97$. The second approximation is obtained from f(4.97) =
−0.677; f'(4.97) = −139.78; Δx = −0.677/139.78 = −0.005; ${}^{2}x = 4.965$.
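
A minimal sketch of the Newton-Raphson correction $\Delta x = -f(x_0)/f'(x_0)$ for the function of Examples 17 and 18 is given below. The derivative $f'(x) = (4 - x)e^{x}$ is obtained by differentiating $(5 - x)e^{x} - 5$; the function name `newton_raphson` and the tolerance are assumptions.

```python
import math

def newton_raphson(f, fprime, x0, tol=1e-10, max_iter=50):
    """Repeat x <- x - f(x)/f'(x) until the correction is below tol."""
    x = x0
    for _ in range(max_iter):
        dx = -f(x) / fprime(x)
        x += dx
        if abs(dx) < tol:
            break
    return x

f = lambda x: (5 - x) * math.exp(x) - 5
fprime = lambda x: (4 - x) * math.exp(x)
root = newton_raphson(f, fprime, 4.9)
print(f"root = {root:.6f}")
```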

c. The Method of Iteration. If we rewrite our equation f(x) = 0 in the
form

$$x = \phi(x)$$    (13-57)

we may substitute an approximate value of x, say $x_0$ on the right of (57)
to get ${}^{1}x = \phi(x_0)$ and repeat to get

$${}^{2}x = \phi({}^{1}x); \qquad {}^{3}x = \phi({}^{2}x); \quad \text{etc.}$$    (13-58)

It is often possible to write /(x) = 0 in the form x = 0(x) in several 
different ways, in which case, it is better to start with the simplest such 
arrangement. A few approximations will indicate whether the chosen 
form is suitable but if the succeeding values of x do not converge rapidly, 
one of the alternative functions should be tried. 

Example 19. Find x of the function in Examples 17 and 18 by the
method of iteration. Writing the equation in the form

$$x = \frac{5(e^{x} - 1)}{e^{x}}$$

we find with $x_0 = 4.9$; $e^{x} = 134.3$; ${}^{1}x = (5 \times 133.3)/134.3 = 4.963$.
The next approximation gives $e^{x} = 143.1$; ${}^{2}x = (5 \times 142.1)/143.1 =
4.965$.
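
The iteration of Example 19 may be sketched in a few lines. The arrangement $x = 5(e^{x} - 1)/e^{x}$ converges quickly because the derivative of the right-hand side, $5e^{-x}$, is small near the root; that remark, the function name `iterate` and the tolerance are additions for illustration.

```python
import math

def iterate(phi, x0, tol=1e-10, max_iter=100):
    """Repeat x <- phi(x) until successive values agree to within tol."""
    x = x0
    for _ in range(max_iter):
        x_new = phi(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

phi = lambda x: 5 * (math.exp(x) - 1) / math.exp(x)
root = iterate(phi, 4.9)
print(f"first step: {phi(4.9):.3f}, converged root: {root:.6f}")
```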

Problem. Solve the equation x log x = 1.5334 by the methods of this section.
Ans.: x = 3.1110.


13.24. Simultaneous Equations in Several Unknowns. — The real roots
of simultaneous algebraic or transcendental equations may be found by the
methods of secs. 13.23b or 13.23c. In the Newton-Raphson method,
when two equations are given

$$f(x, y) = 0; \qquad g(x, y) = 0$$

(56) is replaced by

$$x = x_0 + \Delta x; \qquad y = y_0 + \Delta y$$

where

$$\Delta x = \frac{1}{\Delta}\begin{vmatrix} -f(x_0,y_0) & f_y(x_0,y_0) \\ -g(x_0,y_0) & g_y(x_0,y_0) \end{vmatrix}; \qquad
\Delta y = \frac{1}{\Delta}\begin{vmatrix} f_x(x_0,y_0) & -f(x_0,y_0) \\ g_x(x_0,y_0) & -g(x_0,y_0) \end{vmatrix}$$

$$\Delta = \begin{vmatrix} f_x(x_0,y_0) & f_y(x_0,y_0) \\ g_x(x_0,y_0) & g_y(x_0,y_0) \end{vmatrix}$$    (13-59)

In the method of iteration, we rewrite (59) as

$$x = \phi(x, y); \qquad y = \psi(x, y)$$

then

$${}^{1}x = \phi(x_0, y_0); \qquad {}^{1}y = \psi(x_0, y_0)$$
$${}^{2}x = \phi({}^{1}x, {}^{1}y); \qquad {}^{2}y = \psi({}^{1}x, {}^{1}y); \quad \text{etc.}$$    (13-60)

Both methods are readily extended to cases of more than two unknowns.
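
The two-variable Newton-Raphson corrections may be sketched as below, with Δx and Δy obtained from the 2 × 2 determinants formed from f, g and their partial derivatives. The pair of equations $x^2 + y^2 = 4$, $xy = 1$ and the starting point are hypothetical, chosen only to exercise the formulas; the function name `newton_2d` is likewise an assumption.

```python
def newton_2d(f, g, fx, fy, gx, gy, x, y, tol=1e-12, max_iter=50):
    """Newton-Raphson for f(x, y) = 0, g(x, y) = 0 using 2 x 2 determinants."""
    for _ in range(max_iter):
        F, G = f(x, y), g(x, y)
        det = fx(x, y) * gy(x, y) - fy(x, y) * gx(x, y)
        dx = (-F * gy(x, y) + G * fy(x, y)) / det   # |-f  f_y; -g  g_y| / det
        dy = (-G * fx(x, y) + F * gx(x, y)) / det   # |f_x  -f; g_x  -g| / det
        x, y = x + dx, y + dy
        if abs(dx) + abs(dy) < tol:
            break
    return x, y

# Hypothetical test system: x^2 + y^2 = 4 and x*y = 1.
f = lambda x, y: x * x + y * y - 4
g = lambda x, y: x * y - 1
x, y = newton_2d(f, g,
                 lambda x, y: 2 * x, lambda x, y: 2 * y,   # f_x, f_y
                 lambda x, y: y, lambda x, y: x,           # g_x, g_y
                 2.0, 0.5)
print(f"x = {x:.6f}, y = {y:.6f}")
```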

13.25. Numerical Determination of the Roots of Polynomials. — Any
of the methods of sec. 13.23 may be applied to determine the real roots of a
polynomial. When all of the roots are not required, the Newton-Raphson
method is probably more rapid than the others. In order to evaluate
f(x) and f'(x) for $x = x_0$, the following procedure will be found useful.
Suppose the polynomial is $y(x) = c_0x^{n} + c_1x^{n-1} + \cdots + c_n$. Write the
coefficients in a line, supplying zeros if any powers of x are missing. Multiply
the number $c_0$ by $x_0$ and add the result to $c_1$; multiply this sum ($d_1$)
by $x_0$ and add to $c_2$, continuing until the last sum is obtained; its value
equals y(x) for $x = x_0$. The scheme is illustrated in Table 10. In actual
computation with a calculating machine nothing need be written down
since with proper care to locate the decimal point and due regard to sign,
the whole process may be performed as a continuous operation. The use
of this method is illustrated further in the last part of Example 20.


TABLE 10

  c_0    c_1        c_2        c_3        ...    c_n
         c_0 x_0    d_1 x_0    d_2 x_0    ...    d_{n-1} x_0
         d_1        d_2        d_3        ...    d_n
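
The scheme of Table 10 amounts to a single loop over the coefficients. The sketch below also carries a second running sum that yields the derivative at the same point, which is convenient for the Newton-Raphson method; that extension and the function name `horner` are additions, not part of the table. The coefficients are those of the quartic $x^4 - 56x^3 + 490x^2 + 11{,}112x - 117{,}495$.

```python
def horner(coeffs, x0):
    """Return (y(x0), y'(x0)) for coeffs = [c0, c1, ..., cn], highest power first."""
    value, slope = 0.0, 0.0
    for c in coeffs:
        slope = slope * x0 + value   # running sum for the derivative
        value = value * x0 + c       # d_k = d_(k-1) * x0 + c_k
    return value, slope

coeffs = [1, -56, 490, 11112, -117495]
x0 = 36.89
y0, dy0 = horner(coeffs, x0)
print(f"y({x0}) = {y0:.2f}, y'({x0}) = {dy0:.1f}, correction = {-y0 / dy0:.4f}")
```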


Graeffe's root-squaring method will be found to involve little more labor
than the preceding method with the added advantage that it gives all of
the roots of the polynomial at once. No initial approximation is required
and complex as well as real roots may be found. It is convenient to divide
by the coefficient of $x^{n}$ if necessary so that the polynomial appears in the
form $y(x) = x^{n} + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_n = 0$. Using detached
coefficients, Table 11 is calculated. Care must be taken with the signs of the
doubled cross-products. The new coefficients $b_1, b_2, \cdots$ are then squared
and the cross-products of the b's determined in a similar way. As the
squaring process is continued, it will be found that the doubled
cross-products become progressively smaller, eventually contributing nothing to
the next squared terms. When this point is reached, there will be n
coefficients, say $m_1, m_2, \cdots, m_n$. Then if $x_1, x_2, \cdots, x_n$ are the n real roots of the
polynomial

$$|x_1|^{p} = m_1; \qquad |x_2|^{p} = \frac{m_2}{m_1}; \qquad \cdots; \qquad |x_n|^{p} = \frac{m_n}{m_{n-1}}$$

or,

$$\log|x_1| = \frac{1}{p}\log m_1$$
$$\log|x_2| = \frac{1}{p}(\log m_2 - \log m_1)$$
$$\log|x_3| = \frac{1}{p}(\log m_3 - \log m_2)$$    (13-61)
$$\cdots$$
$$\log|x_n| = \frac{1}{p}(\log m_n - \log m_{n-1})$$

where $p = 2^{s}$ and s is the number of times the squaring operation has been
performed. The signs of the roots must be determined by some rule of
signs but this may often be done by inspection.

Horner's method does not appear to have any advantages over the Newton-
Raphson method. It is described by Mellor, J. W., "Higher Mathematics for Students
of Chemistry and Physics," Longmans, Green and Co., New York, 1902, and in most
elementary algebra texts.


TABLE 11

  1    a_1        a_2            a_3            a_4            ...

  1    a_1^2      a_2^2          a_3^2          a_4^2
       -2a_2      -2a_1 a_3      -2a_2 a_4      -2a_3 a_5
                  +2a_4          +2a_1 a_5      +2a_2 a_6
                                 -2a_6          -2a_1 a_7
                                                +2a_8

  1    b_1        b_2            b_3            b_4            ...


In practice, it is best to carry only four or five figures in the calculations, 
hence tables of squares and four-place logarithms may be used if a calcu- 
lating machine is unavailable. If more figures are required in the roots, 
the use of the Newton-Raphson method serves both to give these additional
figures and to check the previous calculations.

When two (or more) roots of the polynomial are real and equal, one of 
the doubled cross-products will not decrease in magnitude as the squaring 
proceeds; in fact it will always be equal to one-half of the squared term 
which stands just above it. The squaring in this case is stopped when 
the other cross-products no longer contribute to the next coefficients. 

The presence of complex roots in a polynomial expression is revealed 
by the fact that the doubled cross-products do not disappear and the signs 
of some of the sums alternate as the squaring proceeds. The method of 
finding the complex roots as well as pairs of real roots is described in detail
by both Scarborough and by Whittaker and Robinson (loc. cit.).

TABLE 12

(Each doubled cross-product is written in units of the power of ten of the
squared term above it.)

  1    -5.600 x 10        4.900 x 10^2       1.111 x 10^4      -1.175 x 10^5

        3.136 x 10^3      2.401 x 10^5       1.234 x 10^8       1.381 x 10^10
       -0.980            12.44               1.152
                         -2.350
  1     2.156 x 10^3      1.250 x 10^6       2.386 x 10^8       1.381 x 10^10

        4.648 x 10^6      1.562 x 10^12      5.693 x 10^16
       -2.500            -1.029             -3.452
                          0.028
  1     2.148 x 10^6      5.610 x 10^11      2.241 x 10^16      1.907 x 10^20

        4.614 x 10^12     3.147 x 10^23      5.022 x 10^32
       -1.122            -0.963             -2.140
                          0.004
  1     3.492 x 10^12     2.188 x 10^23      2.882 x 10^32      3.637 x 10^40

        1.219 x 10^25     4.787 x 10^46      8.306 x 10^64
       -0.044            -0.201             -1.591
  1     1.175 x 10^25     4.586 x 10^46      6.715 x 10^64      1.323 x 10^81

        1.381 x 10^50     2.103 x 10^93      4.388 x 10^129     1.750 x 10^162
  1     1.904 x 10^100    4.414 x 10^186     1.925 x 10^259     3.062 x 10^324


Example 20. Find the four real roots of the polynomial

$$y(x) = x^4 - 56x^3 + 490x^2 + 11{,}112x - 117{,}495 = 0$$

The method is apparent from Table 12. It will be seen that the second row
of doubled cross-products may be neglected after the eighth power terms
and the first row after the thirty-second power terms, hence the squaring
is stopped after raising the coefficients to the sixty-fourth power. We then
find that

  log |x_1| = 100.2797/64 = 1.5669
  log |x_2| = (186.6448 − 100.2797)/64 = 1.3494
  log |x_3| = (72.6396)/64 = 1.1350
  log |x_4| = (65.2016)/64 = 1.0188

so that |x_1| = 36.89; |x_2| = 22.36; |x_3| = 13.65; |x_4| = 10.45.
Inspection shows that all signs are positive except that of $x_3$. With these
values, the sum of the roots is 56.05, in approximate agreement with the
coefficient of $x^3$ in the original equation.

Solution of similar equations is needed to calculate the energy levels of the
asymmetric top in quantum mechanics, see, for example, Dennison, D. M., Rev. Mod. Phys.,
3, 280 (1931).

In order to improve these values, we make use of the Newton-Raphson
method. With $x_1 = 36.89$, we find

  y(x_1) = [(1 × 36.89) − 56] + [(−19.11 × 36.89) + 490]
           + [(−214.97 × 36.89) + 11,112] + [(3,181.76 × 36.89)
           − 117,495] = −120

In the same way, from

$$y'(x) = 4x^3 - 168x^2 + 980x + 11{,}112$$

we find

  y'(x_1) = [(4 × 36.89) − 168] + [−(20.44 × 36.89) + 980]
            + [(225.97 × 36.89) + 11,112] = 19,448

Then,

  Δx_1 = 120/19,448 = 0.0062

and

  ${}^{1}x_1$ = 36.89 + 0.0062 = 36.8962

Repeating the calculations, we obtain $y({}^{1}x_1) = 3.57$; $y'({}^{1}x_1) = 19{,}478$;
${}^{2}\Delta x_1 = -0.0002$; ${}^{2}x_1 = 36.8960$. This value is correct to five significant
figures. The same procedure applied to the other roots gives 22.3410;
−13.6669; 10.4302. The sum of these values which is 56.0005 gives a
further check on the results.
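
The root-squaring process of Example 20 can be followed by machine. The sketch below is one possible arrangement under stated assumptions: it keeps the coefficients as exact Python integers (the sixty-fourth power terms exceed the range of ordinary floating-point numbers), assumes all roots are real and of distinct magnitude, and returns only the magnitudes, leaving the signs to be settled by inspection as in the text. The function names are illustrative.

```python
import math

def graeffe_step(a):
    """One root-squaring step on integer coefficients a = [1, a1, ..., an]:
    each new coefficient is a_k^2 plus its doubled cross-products."""
    n = len(a) - 1
    b = []
    for k in range(n + 1):
        term = a[k] * a[k]
        for j in range(1, min(k, n - k) + 1):
            term += 2 * (-1) ** j * a[k - j] * a[k + j]
        b.append(term)
    return b

def graeffe_magnitudes(a, squarings):
    """Magnitudes of the roots from the ratios of successive coefficients."""
    m = list(a)
    for _ in range(squarings):
        m = graeffe_step(m)
    p = 2 ** squarings
    logs = [math.log10(c) for c in m]
    return [10 ** ((logs[k] - logs[k - 1]) / p) for k in range(1, len(m))]

coeffs = [1, -56, 490, 11112, -117495]
mags = graeffe_magnitudes(coeffs, 6)         # p = 64, as in Example 20
print(", ".join(f"{v:.4f}" for v in mags))
```

Because no figures are dropped in the integer arithmetic, the magnitudes come out close to the Newton-Raphson refinements quoted above rather than to the four-figure values obtained from Table 12.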


Problem. Find the roots of $x^3 - 15x^2 + 74x - 120 = 0$, by the Graeffe method.
Ans.: x = 4, 5, 6.

13.26. Numerical Solution of Simultaneous Linear Equations. —
Systems of the form

$$\sum_{k=1}^{n} a_{ki}\,x_k = g_i; \qquad (i = 1, 2, \cdots, n)$$    (13-62)

where the $a_{ki}$ and $g_i$ are numbers and the $x_k$ are sought, often occur in
physical problems, particularly in the solution of the normal equations
resulting from a least squares treatment of numerical data (see sec. 13.37b).
Several methods of solving such equations are given by Whittaker and
Robinson (loc. cit.) but none of these are particularly suitable for machine
calculation (see also sec. 10.9). When $a_{ki} = a_{ik}$, which is usually true,
the determinantal method described there offers certain advantages but in
general when the number of unknowns is greater than four or five the
labor of evaluating the determinants becomes prohibitive. The following
systematic procedure which is well adapted for machine calculation will
be found useful in such cases.

Using detached coefficients, the numbers in (62) are written down as in
Table 13. For convenience we assume that there are only three unknowns;
extension of the method to a larger number may be made without difficulty.

TABLE 13

          (A)                              (B)

     g_1      g_2      g_3      |      -              -
     a_11     a_12     a_13     |  -a_22/a_21    -a_23/a_21
   * a_21     a_22     a_23     |      1              0
     a_31     a_32     a_33     |      0              1

              g_2'     g_3'     |      -
              b_11     b_12     |  -b_22/b_21
            * b_21     b_22     |      1

                       g_3''    |
                       c_11     |

Choose some unknown, say $x_2$ for elimination. Divide the numbers of the
corresponding row of (A) by the first number in that row (we indicate it
with a star) and add one's and zero's as shown to form (B). Now consider
$g_1, g_2, g_3$ as a row matrix and multiply the columns of (B) by this
row (see sec. 10.6). The results are $g_2'$ and $g_3'$. For example,
$g_2' = g_1 \times (-a_{22}/a_{21}) + g_2 \times 1 + g_3 \times 0$ and
$g_3' = g_1 \times (-a_{23}/a_{21}) + g_2 \times 0 + g_3 \times 1$.
Multiply rows of (A) by columns of (B), omitting the starred row
of (A). This gives the numbers $b_{ij}$. Again star an element and repeat
the process until the last unknown is eliminated. The values of x are then
given by

$$x_1 = g_3''/c_{11}$$
$$x_3 = (g_2' - b_{11}x_1)/b_{21}$$
$$x_2 = (g_1 - a_{11}x_1 - a_{31}x_3)/a_{21}$$    (13-63)

Some care must be exercised in the order of elimination of the x's, especially
if they are of widely different magnitudes. It is always advisable to begin
with the smallest one, proceeding with the elimination in order of increasing
magnitude. If this is not done, the cumulative errors in the calculations
will produce unsatisfactory values of the unknowns.

More details are given by Frazer, Duncan and Collar, "Elementary Matrices,"
Cambridge Press, 1938. A similar method which does not involve matrix multiplication
is described by Runge and König, loc. cit.





Example 21. Fit the data of Example 3, sec. 13.3 to an equation of the
form $e = x_1 + x_2 t + x_3 t^2$. The three simultaneous equations become

$$\begin{aligned}
x_1 + 630.5\,x_2 + 3.975 \times 10^5\,x_3 &= 5.535 \\
x_1 + 960.5\,x_2 + 9.226 \times 10^5\,x_3 &= 9.117 \\
x_1 + 1063.0\,x_2 + 11.300 \times 10^5\,x_3 &= 10.301
\end{aligned} \tag{13-64}$$


TABLE 14

     5.535             9.117            10.301           |
     1                 1                1                |   -2.32101    -2.84277
     630.5             960.5            1063.0           |    1           0
     3.975 × 10^5*     9.226 × 10^5     11.300 × 10^5    |    0           1

    -3.72979          -5.43373          |
    -1.32101          -1.84277          |   -1.45033
    -502.897*         -729.366          |    1

    -0.02430
    +0.07313*


Since the magnitudes of the x's are probably $x_3 < x_2 < x_1$, we choose the
starred numbers in that order. If we desire four significant figures in the
final results, we note that we must carry six figures in the calculations,
as shown in Table 14. Then,


$$x_1 = -0.02430/0.07313 = -0.3323$$

$$x_2 = -(-3.730 - 1.321 \times 0.3323)/502.9 = 0.00829$$

$$x_3 = (5.535 + 0.3323 - 630.5 \times 0.00829)/(3.975 \times 10^5) = 1.611 \times 10^{-6}$$

Substitution of these results in the original equations gives as a check,
5.535, 9.116, 10.300.
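
The tabular scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `eliminate`, the 0-based `order` argument and the dictionary bookkeeping are choices made here, not part of the text. The array is held as in Table 13, one row per unknown and one column per equation, and the starred element is always taken in the first column.

```python
def eliminate(coeff, g, order):
    """Solve sum_i coeff[j][i] * x[i] = g[j] by successive elimination.

    order lists the unknowns (0-based) in the order they are eliminated;
    the last entry is the unknown that is solved for directly.
    """
    n = len(g)
    # Layout of Table 13: one row per unknown, one column per equation.
    rows = {i: [coeff[j][i] for j in range(n)] for i in range(n)}
    g = list(g)
    stages = []
    for p in order[:-1]:
        pivot = rows[p][0]                      # the starred element
        first_col = {i: r[0] for i, r in rows.items() if i != p}
        stages.append((p, pivot, g[0], first_col))
        factors = [v / pivot for v in rows[p][1:]]
        # Row of g's times columns of (B), then rows of (A) times columns of (B).
        g = [gj - f * g[0] for gj, f in zip(g[1:], factors)]
        rows = {i: [v - f * r[0] for v, f in zip(r[1:], factors)]
                for i, r in rows.items() if i != p}
    last = order[-1]
    pivots = [s[1] for s in stages] + [rows[last][0]]
    x = {last: g[0] / rows[last][0]}
    # Back-substitution through the stored first columns, as in (13-63).
    for p, pivot, g0, first_col in reversed(stages):
        x[p] = (g0 - sum(c * x[i] for i, c in first_col.items())) / pivot
    return [x[i] for i in range(n)], pivots


coeff = [[1.0, 630.5, 3.975e5],
         [1.0, 960.5, 9.226e5],
         [1.0, 1063.0, 11.300e5]]
g = [5.535, 9.117, 10.301]
x, pivots = eliminate(coeff, g, order=[2, 1, 0])   # eliminate x3, then x2
print(x, pivots)
```

Because the sketch carries full floating-point precision rather than six figures, its starred elements agree with those of Table 14 only to about the fifth figure.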

13.27. Evaluation of Determinants. The procedure just outlined is
also applicable to the evaluation of determinants, the scheme being similar
to that shown in Table 14 except for the fact that the g's are omitted. If
the starred elements are taken in the first column and row, that is, in the
order $a_{11}$, $b_{11}$, $c_{11}$, etc., the value of the determinant equals the
product of all of these starred elements. If some other order is chosen as in
Example 21, the determinant still equals this product but it must be multiplied by
$(-1)^n$ where $n$ is the number of interchanges required to bring the starred
elements into the position of the element which stands first in the
corresponding array. If it is convenient to choose starred elements that are not
in the first column the necessary modification of the procedure will be found
described by Frazer, Duncan and Collar (loc. cit.).

Example 22. Evaluate the determinant of the coefficients of the x's
in Example 21. Since two interchanges are required to bring the first
starred element to the position $a_{11}$ and one interchange to bring $b_{21}$ to $b_{11}$,
the value of the determinant $\Delta$ is given by

$$\Delta = (-1)^3 \times (3.975 \times 10^5) \times (-502.897) \times (0.07313)$$

$$= 1.4619 \times 10^7$$
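
The sign rule can be checked with a short Python sketch. The function name `det_by_starred_elements` and the 0-based `order` list are assumptions of this illustration. At each stage the starred element sits in the first column of row `k` of the current array, so `k` interchanges would bring it to the leading position and the product picks up a factor $(-1)^k$.

```python
def det_by_starred_elements(a, order):
    """Determinant of the square matrix a as a signed product of starred elements.

    order gives, stage by stage, which original column of a (one row of the
    working array) supplies the starred element.
    """
    n = len(a)
    rows = [[a[j][i] for j in range(n)] for i in range(n)]   # layout of Table 13
    labels = list(range(n))
    det = 1.0
    for p in order:
        k = labels.index(p)          # interchanges needed to bring the row to the top
        pivot = rows[k][0]
        det *= (-1) ** k * pivot
        factors = [v / pivot for v in rows[k][1:]]
        rows = [[v - f * r[0] for v, f in zip(r[1:], factors)]
                for i, r in enumerate(rows) if i != k]
        labels.pop(k)
    return det


a = [[1.0, 630.5, 3.975e5],
     [1.0, 960.5, 9.226e5],
     [1.0, 1063.0, 11.300e5]]
d = det_by_starred_elements(a, [2, 1, 0])       # order used in Example 21
d_first = det_by_starred_elements(a, [0, 1, 2])  # starred elements a11, b11, c11
print(d, d_first)
```

Both orders should give the same value, which is a convenient check on the count of interchanges.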

Problem. Evaluate some of the determinants of Example 23, sec. 13.28. The 
answers are found in Table 16. 

13.28. Solution of Secular Determinants. In many quantum mechanical
problems, it is necessary to find one or more roots of a secular equation
(see sec. 10.14):

$$y(\lambda) = |\,a_{ij} - b_{ij}\lambda\,| = 0 \tag{13-65}$$

where $a_{ij} = a_{ji}$, $b_{ij} = b_{ji}$; $i, j = 1, 2, \cdots, N$. In most cases,
$b_{ij} = \delta_{ij}$, but even if this is not true in the original form of the
determinant it is usually possible to reduce (65) to this form by suitable
addition and subtraction of rows and columns. We shall assume here that $\lambda$
occurs only in the diagonal elements. The particular method to be used in finding
values of $\lambda$ depends to some extent on the special problem at hand. We
present three methods, each of which has certain advantages.


a. The Polynomial Method. When (65) is expanded, it obviously
gives a polynomial of the N-th degree in $\lambda$. Once this polynomial is
obtained, either of the methods of sec. 13.25 may be used to find values of $\lambda$.
Graeffe's method is particularly useful when it is required to find all of
them. To convert the determinant into the polynomial, its expansion
may be effected by the usual method of reduction of its order (see sec.
10.3) or by a very convenient procedure which has been described by
Hicks.²²

According to the latter method, we substitute $\lambda = 0, 1, 2, \cdots, (N + 1)$
in the given determinant and evaluate each numerically. From these
$(N + 2)$ results, $y_0, y_1, y_2, \cdots, y_{N+1}$, a table of differences is formed as
described in sec. 13.2. An immediate check on the computation of the
determinants is available, for the $(N + 1)$-st differences should vanish.
The polynomial is then given by


$$y(\lambda) = \sum_{t=0}^{N} p_t \lambda^t \tag{13-66}$$

where

$$p_0 = y_0; \qquad p_t = \sum_{s=t}^{N} r_{ts}\,\Delta^s y_0; \qquad t \ge 1 \tag{13-67}$$

The coefficients $r_{ts}$ are independent of the values of the elements in (65)
and may be computed from the following relations:

$$r_{ts} = \frac{c_t(s)}{s!} \tag{13-68}$$

where

$$c_s(s) = 1; \qquad c_t(s + 1) = c_{t-1}(s) - s\,c_t(s); \qquad c_0(s) = 0$$

$$c_1(s + 1) = (-1)^s s!,\ s \ge 1; \qquad c_{s-1}(s) = \tfrac{1}{2}s(1 - s)$$

The results may be checked by the identities

$$\sum_{t=1}^{s} r_{ts} = 0,\ s \ge 2; \qquad \sum_{t=1}^{s} |r_{ts}| = 1 \tag{13-69}$$

Values of the $r_{ts}$ through $t = s = 6$ are given in Table 15.

TABLE 15

   s \ t      1         2          3         4          5         6
     1        1
     2      -1/2       1/2
     3       1/3      -1/2        1/6
     4      -1/4      11/24      -1/4       1/24
     5       1/5      -5/12       7/24     -1/12       1/120
     6      -1/6     137/360     -5/16     17/144     -1/48      1/720

²² Hicks, B. L., J. Chem. Phys. 8, 569 (1940).


Example 23. As an example of the use of this method, we choose the
secular determinant whose expanded form served as an example for the
Graeffe method (see Example 20). The determinant follows

$$y(\lambda) = \begin{vmatrix}
36 - \lambda & -4.062 & 0 & 0 \\
-4.062 & 16 - \lambda & 8.216 & 0 \\
0 & 8.216 & 4 - \lambda & 14.49 \\
0 & 0 & 14.49 & -\lambda
\end{vmatrix} = 0$$

Making the substitutions $\lambda = 0, 1, 2, 3, 4, 5$ in turn and evaluating the
determinants, we obtain Table 16. The fact that the fifth differences
vanish assures us that the determinants have been computed correctly.

TABLE 16

   λ         y            Δ          Δ²       Δ³      Δ⁴
   0     -117,495
   1     -105,948     +11,547
   2      -93,743     +12,205     +658
   3      -81,180     +12,563     +358     -300
   4      -68,535     +12,645      +82     -276     +24
   5      -56,060     +12,475     -170     -252     +24

From (67), Tables 15 and 16, we find

$$p_0 = -117{,}495$$

$$p_1 = 11{,}547 - \frac{658}{2} - \frac{300}{3} - \frac{24}{4} = 11{,}112$$

$$p_2 = \frac{658}{2} + \frac{300}{2} + \frac{11 \times 24}{24} = 490$$

$$p_3 = -\frac{300}{6} - \frac{24}{4} = -56$$

$$p_4 = \frac{24}{24} = 1$$

hence the required polynomial is

$$y(\lambda) = \lambda^4 - 56\lambda^3 + 490\lambda^2 + 11{,}112\lambda - 117{,}495$$

in agreement with the result given in Example 20.
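
The conversion from tabulated determinants to polynomial coefficients can be sketched in Python as follows. The function name `polynomial_from_values` is hypothetical, and the sketch assumes the $c_t(s)$ are generated by the recurrence of (13-68) with the starting value $c_0(0) = 1$; exact fractions are used so that the integers of Table 16 are reproduced without rounding.

```python
from fractions import Fraction
from math import factorial


def polynomial_from_values(y):
    """Coefficients p[0], p[1], ... of y(lam) from the values y(0), y(1), y(2), ..."""
    n = len(y) - 1
    # Leading forward differences: diffs[s] is the s-th difference at lam = 0.
    diffs, col = [], list(y)
    while col:
        diffs.append(col[0])
        col = [b - a for a, b in zip(col, col[1:])]
    # c[s][t] obeys c_t(s+1) = c_{t-1}(s) - s*c_t(s), with c_s(s) = 1.
    c = [[0] * (n + 1) for _ in range(n + 1)]
    c[0][0] = 1
    for s in range(n):
        for t in range(1, s + 2):
            c[s + 1][t] = c[s][t - 1] - s * c[s][t]
    # p_t = sum over s of r_ts * (s-th difference), with r_ts = c_t(s)/s!.
    return [sum(Fraction(c[s][t], factorial(s)) * diffs[s] for s in range(n + 1))
            for t in range(n + 1)]


y = [-117495, -105948, -93743, -81180, -68535, -56060]   # values from Table 16
p = polynomial_from_values(y)
print([int(v) for v in p])
```

Supplying all six values makes the last coefficient the check mentioned in the text: it vanishes when the fifth difference does.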


b. Matrix Method. A matrix method, described by Frazer, Duncan
and Collar (loc. cit.) is sometimes useful. It gives the largest value of
$|\lambda|$ only, but in quantum mechanical problems this is often all that is
required. The method does not converge rapidly unless the largest root
is widely separated from the remaining ones. The procedure is as follows.
Set $\lambda = 0$ in the secular determinant and multiply the resulting matrix by
a matrix of one column. The latter is arbitrary but in its most convenient
form it contains unity in one row and zeros in the other rows. Extract a
constant scalar quantity from the resulting matrix product and multiply
the original matrix with the new one-column matrix. Continue in the same
way until the scalar quantity becomes constant. This is the required root
of largest amplitude.


Example 24. Find the largest root of the secular determinant of
Example 23. The procedure is apparent from the following.

$$\begin{pmatrix}
36 & 4.062 & 0 & 0 \\
4.062 & 16 & 8.216 & 0 \\
0 & 8.216 & 4 & 14.49 \\
0 & 0 & 14.49 & 0
\end{pmatrix}
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}
= \begin{pmatrix} 36 \\ 4.062 \\ 0 \\ 0 \end{pmatrix}
= 36 \begin{pmatrix} 1 \\ 0.1128 \\ 0 \\ 0 \end{pmatrix}$$

For the next approximation,

$$\begin{pmatrix}
36 & 4.062 & 0 & 0 \\
4.062 & 16 & 8.216 & 0 \\
0 & 8.216 & 4 & 14.49 \\
0 & 0 & 14.49 & 0
\end{pmatrix}
\begin{pmatrix} 1 \\ 0.1128 \\ 0 \\ 0 \end{pmatrix}
= \begin{pmatrix} 36.46 \\ 5.87 \\ 0.93 \\ 0 \end{pmatrix}
= 36.46 \begin{pmatrix} 1 \\ 0.1610 \\ 0.0256 \\ 0 \end{pmatrix}$$

Continuing in the same way, we obtain the results in Table 17. The sixth,
seventh, eighth and ninth approximations give 36.85, 36.87, 36.88, 36.88,
hence $\lambda = 36.88$. Comparison with Example 20 shows that this result is
uncertain in the last place. The convergence here is not very rapid since
the next largest root is 22.341. More rapid convergence could be obtained
by squaring the original matrix several times before commencing the matrix
multiplications. The constant value so obtained is then some power of the
desired root. Once having found the largest root, the next largest one may
be obtained by the same method. Further details are given by Frazer,
Duncan and Collar (loc. cit.).


TABLE 17

Successive Column Matrices

          Third                 Fourth                Fifth
     36.65    1            36.76    1            36.81    1
      6.85    0.1869        7.37    0.2005        7.68    0.2086
      1.42    0.0387        1.84    0.0500        2.07    0.0562
      0.37    0.0101        0.56    0.0152        0.72    0.0196
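
The repeated multiplication lends itself to a short Python sketch. The name `largest_root` and the fixed number of steps are assumptions made for illustration; the scalar extracted at each step is taken to be the first element of the product, as in the example above.

```python
def largest_root(a, steps=60):
    """Repeated multiplication of a by a one-column matrix.

    Returns the scalar extracted at the last step and the final column,
    scaled so that its first element is unity.
    """
    col = [1.0] + [0.0] * (len(a) - 1)       # unity in one row, zeros elsewhere
    scale = 0.0
    for _ in range(steps):
        prod = [sum(aij * cj for aij, cj in zip(row, col)) for row in a]
        scale = prod[0]                       # scalar extracted from the product
        col = [v / scale for v in prod]
    return scale, col


a = [[36.0, 4.062, 0.0, 0.0],
     [4.062, 16.0, 8.216, 0.0],
     [0.0, 8.216, 4.0, 14.49],
     [0.0, 0.0, 14.49, 0.0]]
root, column = largest_root(a)
print(round(root, 3), [round(v, 4) for v in column])
```

Calling `largest_root(a, 2)` stops after the second approximation, so the early entries of the hand calculation can be compared step by step.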


c. Iteration Method. Several iteration methods which do not depend
on matrix properties have been described. Crude approximations to the
roots of the polynomial are given by the diagonal terms in the secular
determinant. Suppose one of these values, say $\lambda_0$, is substituted in the
determinant for $\lambda$ in every place except one where the quantity
$\lambda_0 - \lambda$ occurs. Now if the determinant is evaluated, the resultant
value of $\lambda$ is the next approximation to the true value. The process may
be repeated as often as necessary.

Example 25. Find a root of the determinant of Example 23 by the
iteration method. Taking $\lambda_0 = 36$, the determinant becomes

$$\begin{vmatrix}
36 - \lambda & 4.062 & 0 & 0 \\
4.062 & -20 & 8.216 & 0 \\
0 & 8.216 & -32 & 14.49 \\
0 & 0 & 14.49 & -36
\end{vmatrix}$$

When this is evaluated, we obtain ${}^1\lambda = 36.893$. Substitution of ${}^1\lambda$ in the
original determinant gives ${}^2\lambda = 36.896$. The third approximation gives
the same result.
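
For a determinant with non-zero elements only on the diagonal and next to it, as in Example 23, each evaluation reduces to a continued fraction. The following Python sketch rests on that assumption; the function name `next_approximation` and the fixed count of twenty repetitions are illustrative choices, not part of the text.

```python
def next_approximation(diag, off, lam0):
    """One step of the iteration for a tridiagonal secular determinant.

    diag holds the diagonal constants and off the off-diagonal elements.
    The first diagonal place keeps diag[0] - lam; every other diagonal
    place receives the current approximation lam0.
    """
    tail = diag[-1] - lam0
    for d, b in zip(reversed(diag[1:-1]), reversed(off[1:])):
        tail = d - lam0 - b * b / tail       # ratio of successive lower minors
    # Setting the determinant to zero gives diag[0] - lam = off[0]**2 / tail.
    return diag[0] - off[0] ** 2 / tail


diag = [36.0, 16.0, 4.0, 0.0]
off = [4.062, 8.216, 14.49]
lam = diag[0]                                 # crude value from the diagonal
for _ in range(20):
    lam = next_approximation(diag, off, lam)
print(round(lam, 3))
```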

Problem. Compute some of the coefficients of Table 15.

See, for example, James and Coolidge, J. Chem. Phys. 1, 825 (1933); Cross and
Crawford, J. Chem. Phys. 5, 621 (1937).

PART 3. ERRORS AND LEAST SQUARES

13.29. Errors. Measurements are always accompanied by errors.
They are of two kinds: determinate and random. Those of the first type
are often constant or systematic, being due to faulty or incorrectly adjusted
instruments, mistakes on the part of the observer in reading a scale, recording
a number, or other similar effects. It is usually possible to discover the
causes of such errors and to make corrections for them. Random errors,
on the other hand, are indeterminate and due to unknown causes, but they
may be treated by statistical methods. As in the previous parts of this
chapter, we shall often refer the reader to other sources for proofs of
theorems and results to be given here.

Suppose several equally reliable measurements of a physical quantity
yield the numbers $X_1, X_2, \cdots, X_n$. The corresponding errors are defined
by

$$x_1 = X_1 - X,\quad x_2 = X_2 - X,\quad \cdots,\quad x_n = X_n - X \tag{13-70}$$

where $X$ is the true value of the quantity. Actually, we seldom know
the true value since any experiment made to determine it will be accompanied
by random errors. However, in order to proceed further we must
choose some quantity which is called the most probable value. It will be
indicated by $\bar{X}$, the notation anticipating a fact that we prove in sec. 13.30,
namely, the most probable value is the average of all the data. Since $\bar{X}$ is
not equal to $X$, the true value, we must distinguish between the error and
the residual which is defined by

$$d_1 = X_1 - \bar{X},\quad d_2 = X_2 - \bar{X},\quad \cdots,\quad d_n = X_n - \bar{X} \tag{13-71}$$


Errors of this kind are discussed in some detail by Crumpler, T. B. and Yoe, J. H.,
"Chemical Computations and Errors," John Wiley and Sons, New York, 1940. They
may be detected in some cases by methods explained by Birge, R. T., Phys. Rev. 40, 207
(1932).

See for example, von Mises, Richard, "Probability, Statistics and Truth,"
Macmillan Co., New York, 1939.

An exception is the case where the quantity is exact by definition. For example,
the true value of the atomic weight of oxygen is 16.0000 to as many decimal places as
may be needed.

It is assumed that the errors and residuals with which we are concerned
are random ones. They are neither systematic nor constant but are equally
likely to be positive or negative. Small errors are more frequent than large
ones and very large errors do not occur at all. Under these conditions,
the errors follow the laws of probability as given by the normal "Gauss"
distribution (see sec. 8.3):

$$w(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}$$

It is convenient for our purposes to change the notation, writing $w(x) = N$
and $h^2 = 1/2\sigma^2$. The resulting equation

$$N = \frac{h}{\sqrt{\pi}}\, e^{-h^2x^2} \tag{13-72}$$

gives us the relative number of measurements $N$ having an error $x$. The
plot of $N$ vs. $x$ is called the Gauss error curve; it is shown in Fig. 1 for
$h = 1$ and $h = 0.6$. From that curve or from eq. (72), we can discover
the meaning of the constant $h$ which is called the precision index. When
it is large, $N$ is large for a given small error $x$ and decreases as $x$ increases.



Thus a high precision index means that a large number of the measurements 
agree closely with the true value of the quantity observed. On the other 
hand, if h is small, a smaller fraction of the results are close to the true value 
and more large errors occur than in the previous case. 

The probability that the error of a single measurement will lie between
the limits $\pm a$ is

$$P = \frac{h}{\sqrt{\pi}} \int_{-a}^{a} e^{-h^2x^2}\,dx = \frac{2}{\sqrt{\pi}} \int_0^{ha} e^{-u^2}\,du$$

This integral occurs so often in mathematical physics that it has been given
the special name of the error function. It is usually denoted by

$$\operatorname{erf}(t) = \frac{2}{\sqrt{\pi}} \int_0^{t} e^{-u^2}\,du$$

Hence the probability in question is $\operatorname{erf}(ha)$. It cannot be evaluated in
finite form but must be expanded in a power series and integrated term by
term. Values of the integral as a function of $t$ are found in all books on
probability.

The special case where the limits of integration are $\pm\infty$ is of considerable
interest. The error must lie somewhere within this range, hence the
probability must be unity. This is readily found to be true when the
integration is performed.

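The simplest way of evaluating the integral when the limits are $\pm\infty$
is the following. Let

$$P = \frac{2}{\sqrt{\pi}} \int_0^{\infty} e^{-u^2}\,du = \frac{2}{\sqrt{\pi}} \int_0^{\infty} e^{-v^2}\,dv$$

so that

$$P^2 = \frac{4}{\pi} \int_0^{\infty}\!\!\int_0^{\infty} e^{-(u^2 + v^2)}\,du\,dv$$

Transforming to polar coordinates we get: $du\,dv = r\,dr\,d\phi$, $u^2 + v^2 = r^2$,

$$P^2 = \frac{4}{\pi} \int_0^{\pi/2} d\phi \int_0^{\infty} e^{-r^2} r\,dr = \frac{4}{\pi} \cdot \frac{\pi}{2} \cdot \frac{1}{2} = 1$$

Thus we see that the area under the whole curve (72) is unity. This,
obviously, is the reason for the constant $2/\sqrt{\pi}$.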
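
These statements are easy to confirm numerically. The Python sketch below compares a trapezoidal estimate of the area under the error curve with the library function `math.erf`; the helper names `gauss_curve` and `probability_within` and the number of steps are assumptions of the illustration.

```python
import math


def gauss_curve(x, h):
    """Ordinate N of the error curve (13-72) for precision index h."""
    return h / math.sqrt(math.pi) * math.exp(-(h * x) ** 2)


def probability_within(a, h, steps=20000):
    """Trapezoidal estimate of the area under the error curve between -a and +a."""
    dx = 2 * a / steps
    total = 0.5 * (gauss_curve(-a, h) + gauss_curve(a, h))
    total += sum(gauss_curve(-a + k * dx, h) for k in range(1, steps))
    return total * dx


p_half = probability_within(0.5, 2.0)    # should agree with erf(h*a) = erf(1)
p_all = probability_within(10.0, 1.0)    # limits wide enough to act as infinity
print(p_half, math.erf(1.0), p_all)
```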

13.30. Principle of Least Squares. Suppose $n$ measurements have
been made, the $i$-th one having the error $x_i$. The probability that $x_i$ lies
between $x_i$ and $x_i + dx_i$ is

$$P_i = \frac{h}{\sqrt{\pi}}\, e^{-h^2x_i^2}\,dx_i \tag{13-73}$$

The probability that the $n$ errors $x_1, x_2, \cdots, x_n$ occur is the product of $n$
terms like (73), for each measurement is an independent event. Hence
we have

$$P = \prod_i P_i = \left(\frac{h}{\sqrt{\pi}}\right)^{n} e^{-h^2(x_1^2 + x_2^2 + \cdots + x_n^2)}\,dx_1\,dx_2 \cdots dx_n \tag{13-74}$$

Clearly the differentials $dx_1, dx_2, \cdots$ are arbitrary, for they may be
interpreted as the smallest subdivisions on a scale which is being read. Finally,
remembering that $h$ is fixed, we see that the probability $P$ is a maximum
when the exponent of $e$ is a minimum; thus we have

$$x_1^2 + x_2^2 + \cdots + x_n^2 = \text{a minimum} \tag{13-75}$$

as the criterion for the most probable value obtainable from $n$ equally
reliable measurements of a quantity. This result is known as the Principle
of Least Squares.

See also "Tables of the Probability Functions P(x) and Erf (x)," Works Progress
Administration, New York City, 1941.

In accordance with that principle, let us determine the most probable
value of the set of measurements $X_1, X_2, \cdots, X_n$. Rewrite (75) in the form

$$(X_1 - X)^2 + (X_2 - X)^2 + \cdots + (X_n - X)^2$$

differentiate with respect to $X$ and equate the derivative to zero in order to
obtain a minimum. Since the result is to be the most probable value of $X$,
we replace $X$ by the symbol $\bar{X}$ to indicate that $\bar{X}$ is chosen to satisfy eq.
(75). The answer is

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} \tag{13-76}$$

As might be expected the most probable value is the arithmetic mean of all
of the experimental results. It is interesting to note that the error law of
eq. (72) is, within reasonable limits, the only form of equation which gives
the average as the most probable value.

13.31. Errors and Residuals. If we add $n$ errors, we find, since
$x_i = X_i - X$,

$$\sum X_i = nX + \sum x_i$$

and from eq. (76)

$$\bar{X} = \frac{1}{n} \sum X_i = X + \frac{1}{n} \sum x_i \tag{13-77}$$

Also, we obtain for the first residual

$$d_1 = X_1 - \bar{X} = X_1 - X - \frac{1}{n} \sum x_i$$

$$= x_1 - \frac{1}{n}x_1 - \frac{1}{n}x_2 - \frac{1}{n}x_3 - \cdots \tag{13-78}$$

with similar equations for the others. We thus conclude that as $n$ increases,
the second term on the right of (77) becomes smaller and $\bar{X}$ approaches the
true value $X$. In the same way, we conclude from (78) that as $n$ increases,
the residuals approach the true errors. Actually, if we square $n$ equations
like (78) and add them, we get

$$\sum d_i^2 = \sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2$$

so that the sum of the squares of the residuals is slightly less than the sum
of the squares of the errors.

A proof is given by Plummer, H. C., "Probability and Frequency," Macmillan
Co., London, 1940, p. 123.

Suppose two independent quantities ($M_1$ and $M_2$) have been measured
and the errors in each case obey the normal law. Then the probability of
an error between $x_1$ and $x_1 + dx_1$ in $M_1$ is

$$P_1 = \frac{h_1}{\sqrt{\pi}}\, e^{-h_1^2x_1^2}\,dx_1$$

while the probability of an error between $x_2$ and $x_2 + dx_2$ in $M_2$ is

$$P_2 = \frac{h_2}{\sqrt{\pi}}\, e^{-h_2^2x_2^2}\,dx_2$$

Since the observed quantities are independent of each other, the probability
of the simultaneous occurrence of these errors in $M_1$ and $M_2$ is

$$p = P_1P_2$$

Now suppose that $M_1$ and $M_2$ are combined linearly to form a quantity

$$M = \alpha_1M_1 + \alpha_2M_2$$

where $\alpha_1$ and $\alpha_2$ are constants. The error in $M$ will lie between

$$\alpha_1x_1 + \alpha_2x_2 = x \tag{13-79}$$

and

$$\alpha_1(x_1 + dx_1) + \alpha_2(x_2 + dx_2) = x + dx$$

We recognize the fact that such an error may be composed of any value of
$x_1$ between $\pm\infty$ together with the corresponding value of $x_2$ fixed by eq.
(79). Thus to compute the probability of an error $x$ in $M$ we integrate
$p = P_1P_2$ with respect to $x_1$ between the limits $\pm\infty$ and eliminate $dx_2$ by
the relation $dx = \alpha_2\,dx_2$ which will be true when the integration has been
performed over $x_1$. Let us first rewrite $p$ in terms of $x_1$ and $x$ which gives

$$p = C \exp\left\{-h_1^2x_1^2 - \frac{h_2^2}{\alpha_2^2}(x - \alpha_1x_1)^2\right\} dx_1\,dx_2$$

where $C = h_1h_2/\pi$. With the further abbreviation

$$H^2 = \frac{h_1^2h_2^2}{\alpha_1^2h_2^2 + \alpha_2^2h_1^2}$$

we also have

$$p = C \exp\left\{-H^2x^2 - \frac{h_1^2h_2^2}{\alpha_2^2H^2}\left(x_1 - \frac{\alpha_1H^2}{h_1^2}\,x\right)^2\right\} dx_1\,dx_2$$

Let $N(x)\,dx$ be the required probability of an error in $M$ between $x$ and
$x + dx$, then

$$N(x)\,dx = C e^{-H^2x^2}\,dx_2 \int_{-\infty}^{\infty} \exp\left[-\frac{h_1^2h_2^2}{\alpha_2^2H^2}\left(x_1 - \frac{\alpha_1H^2}{h_1^2}\,x\right)^2\right] dx_1$$

Since $\alpha_2\,dx_2 = dx$ we see that

$$N(x) = \frac{H}{\sqrt{\pi}}\, e^{-H^2x^2}$$

where

$$H^2 = \frac{h_1^2h_2^2}{\alpha_1^2h_2^2 + \alpha_2^2h_1^2}$$

or

$$\frac{1}{H^2} = \frac{\alpha_1^2}{h_1^2} + \frac{\alpha_2^2}{h_2^2} \tag{13-80}$$

Thus the error law for $M$ is the same as the error law for $M_1$ and $M_2$, the
only difference being in the precision index. The equation is easily
generalized by the same method; in fact it may be shown that if

$$M = \sum \alpha_iM_i$$

the precision index of $M$ is given by

$$\frac{1}{H^2} = \sum \frac{\alpha_i^2}{h_i^2} \tag{13-81}$$

We would like to apply this result to the residuals. From (78), we may
write

$$d_i = \frac{(n - 1)}{n}\,x_i - \frac{1}{n} {\sum_{j=1}^{n}}' x_j \tag{13-78a}$$

where the prime on the summation sign means that the term $i = j$ is
omitted. The residuals are thus linear combinations of the errors, for $d_i$
corresponds to $M$ in the preceding discussion and

$$\alpha_1 = \frac{(n - 1)}{n}; \qquad \alpha_2 = \alpha_3 = \cdots = \alpha_n = -\frac{1}{n}$$

The error law for the residuals is of the form of (72) or (80)

$$N = \frac{H}{\sqrt{\pi}}\, e^{-H^2d^2} \tag{13-80a}$$

and from (81) since $h$ is the precision index for each $x_i$

$$\frac{1}{H^2} = \frac{1}{h^2}\left[\frac{(n - 1)^2}{n^2} + \frac{1}{n^2} + \frac{1}{n^2} + \cdots + \frac{1}{n^2}\right] = \frac{1}{h^2n^2}\left[(n - 1)^2 + n - 1\right]$$

or

$$H = h\sqrt{\frac{n}{n - 1}} \tag{13-82}$$

From (82), it is seen that the precision index for the residuals depends on
both $n$ and $h$ and is always larger than $h$. Reference to Fig. 1 shows that
the curve of (80a) rises higher in the middle and falls off more rapidly than
the curve of (72) but as the number of measurements increases the two
graphs approach each other more closely.

13.32. Measures of Precision. Having obtained the most probable
value of a series of measurements, we need to find expressions for its
reliability. In order to do this we must first consider the case where the true
value $X$ of the quantity is known. We may then proceed to the more
practical question of expressing the uncertainty of $\bar{X}$ in terms of the
residuals. If the precision index were known it would be suitable for our measure
of precision for as we have seen in sec. 13.29, $\operatorname{erf}(hx)$ is the probability that
the error is within the range $\pm x$. However, $h$ has the dimension of
a reciprocal error and it proves more convenient to use as a precision
measure a quantity which is inversely proportional to $h$, thus having the
same dimension as the error itself. Three such measures are commonly
employed; they are the average error ($a$), the root mean square error ($m$)
and the probable error ($r$).

The average error is the arithmetic mean of all the errors without regard
to sign

$$a = \frac{\sum |x_i|}{n} \tag{13-83}$$

From its definition (see sec. 12.3), it follows that

$$a = \int_{-\infty}^{\infty} |x|\,N\,dx = \frac{1}{h\sqrt{\pi}} \tag{13-84}$$

Let us seek the most probable value of $h$. We recall that $P$ of eq. (74)
is the probability of the simultaneous occurrence of the errors $x_1, x_2, \cdots, x_n$.
Hence we must make $P$ of that equation a maximum. Taking the
logarithm of (74) we see that the most probable value of $h$ is that quantity
which makes

$$\phi = n \log h - h^2 \sum x_i^2$$

a maximum, or

$$\frac{d\phi}{dh} = \frac{n}{h} - 2h \sum x_i^2 = 0$$

hence

$$\frac{1}{h\sqrt{2}} = \sqrt{\frac{\sum x_i^2}{n}} = m \tag{13-85}$$

which is called the root mean square error. Comparison of (84) with (85) shows
that

$$a = m\sqrt{\frac{2}{\pi}} \tag{13-86}$$

The root mean square error is frequently used in mathematical statistics;
there it is called the standard deviation and indicated by $\sigma$ (see sec. 12.3,
especially problem a).

The probable error is defined as that error $r$ such that one half of the
errors of $n$ observations are greater than $r$ and one half are less than $r$.
Thus it is given by the integral

$$\operatorname{erf}(hr) = \tfrac{1}{2} \tag{13-87}$$

for this says that there is an equal chance that a given error lies within $\pm r$
or outside these limits. From tables of the integral, we obtain

$$hr = 0.4769363 \cdots \tag{13-88}$$

Combining this result with (85) we get for the probable error

$$r = 0.6745 \sqrt{\frac{\sum x_i^2}{n}} = 0.6745\,m \tag{13-89}$$

From eqs. (86) and (89) we can readily obtain all relations between
$a$, $m$ and $r$. They are

$$r = 0.4769\,h^{-1} = 0.6745\,m = 0.8453\,a$$

$$m = 0.7071\,h^{-1} = 1.4826\,r = 1.2533\,a$$

$$a = 0.5642\,h^{-1} = 0.7979\,m = 1.1829\,r$$
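
The numerical constants in these relations can be regenerated with a few lines of Python. The sketch below is an illustration only: the bisection routine `solve_hr` and the variable names are assumptions, and `math.erf` stands in for the tables of the integral.

```python
import math


def solve_hr(target=0.5):
    """Solve erf(t) = target for t by bisection; t plays the role of h*r in (13-87)."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


hr = solve_hr()                    # r expressed in units of 1/h, eq. (13-88)
hm = 1 / math.sqrt(2)              # m in units of 1/h, eq. (13-85)
ha = 1 / math.sqrt(math.pi)        # a in units of 1/h, eq. (13-84)

for name, value in [("r/m", hr / hm), ("r/a", hr / ha), ("m/r", hm / hr),
                    ("m/a", hm / ha), ("a/m", ha / hm), ("a/r", ha / hr)]:
    print(name, round(value, 4))
```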


The geometric significance of the three precision measures is also of interest.
The average error $a$ is the abscissa of the center of gravity of the area
bounded by the error curve and the axes $x$ and $N$ of eq. (72). To see this,
let $x_0$ be the center of gravity of that area, then

$$x_0 = \frac{\int_0^{\infty} xN\,dx}{\int_0^{\infty} N\,dx} = \frac{1}{h\sqrt{\pi}} = a$$

which follows from (84) since $\int_0^{\infty} N\,dx = \tfrac{1}{2}$.

The root mean square error is the radius of gyration of the same area
about the $N$ axis; it is also the abscissa of the point of inflection of the
error curve, as will now be shown. For the point of inflection, $d^2N/dx^2 = 0$
and from (72)

$$\frac{d^2N}{dx^2} = -\frac{2h^3}{\sqrt{\pi}}\, e^{-h^2x^2}\,(1 - 2h^2x^2)$$

Thus,

$$(1 - 2h^2x^2) = 0$$

or

$$x = \pm\frac{1}{h\sqrt{2}} = \pm m$$

From the definition of $r$, it follows that the abscissa $x = r$ corresponds
to the ordinate which bisects the area of the error curve (72) between 0
and $\infty$.



The relative sizes and positions of these three measures are shown in
Fig. 2 where we draw only that half of (72) corresponding to positive
values of $x$. It is perhaps not amiss to comment on the most appropriate
measure to use. The average error recommends itself because of the ease
with which it is computed. The probable error is less easy to calculate
but is perhaps more often used than the others in chemical and physical
literature. As may be seen from Fig. 2 it is the smallest of the three and is
thus more flattering than $a$ or $m$ to a set of experimental data. There is
little choice between the three measures on theoretical grounds.

Convenient tables of $0.6745/\sqrt{n}$ as a function of $n$ may be found in "Handbook
of Chemistry and Physics," Chemical Rubber Publishing Co., Cleveland, Ohio.

It is often of importance to find some estimate of the probable error of
an adopted precision measure itself. The result has been obtained by
Gauss who shows that the relative error of $r$ is

$$\frac{0.4769}{\sqrt{n}}$$

With 10 measurements, it is seen that the probable error is uncertain by
about 15 per cent while for even 500 measurements the uncertainty is
2 per cent. It thus follows that it is seldom if ever of meaning to state
the probable error with more than two significant figures, for usually one
of these is uncertain.

A derivation of it is given by Plummer, loc. cit.

13.33. Precision Measures and Residuals. From the equations of
the previous section it is a simple matter to express the precision measures
in terms of residuals. Suppose $X_1, X_2, \cdots, X_n$ are $n$ observations. If they
follow the error law, the residual $d_i$ is given by eq. (78a) and the index of
precision of the residuals by eq. (82). Therefore, the average error

$$a = \frac{1}{h\sqrt{\pi}} = \sqrt{\frac{n}{n - 1}}\,\frac{1}{H\sqrt{\pi}} = \sqrt{\frac{n}{n - 1}}\,\frac{\sum |d_i|}{n} = \frac{\sum |d_i|}{\sqrt{n(n - 1)}} \tag{13-83a}$$

Similarly,

$$m = \frac{1}{h\sqrt{2}} = \sqrt{\frac{n}{n - 1}}\,\sqrt{\frac{\sum d_i^2}{n}} = \sqrt{\frac{\sum d_i^2}{n - 1}} \tag{13-85a}$$

and

$$r = 0.6745\,m = 0.6745\sqrt{\frac{\sum d_i^2}{n - 1}} \tag{13-89a}$$

The differences between eqs. (83), (85), (89) and (83a), (85a), (89a)
should be carefully noted. In many cases, the deviations are used in
place of the errors to get $a$ from (83) rather than from the correct
eq. (83a). The difference is negligible, of course, in most cases.

The most probable value or arithmetic mean also follows the error law.
Its index of precision is obtained from (81) where $\alpha_i = 1/n$, hence

$$\frac{1}{H^2} = \sum \frac{1}{n^2h^2} = \frac{1}{nh^2}$$

Thus if $a$, $m$ and $r$ refer to the individual members of a set of $n$
measurements, the corresponding precision measures relating to the arithmetic
mean are

$$A = \pm\frac{a}{\sqrt{n}}, \qquad M = \pm\frac{m}{\sqrt{n}}, \qquad R = \pm\frac{r}{\sqrt{n}}$$

It will be observed that the precision varies as the square root of $n$.
Therefore comparatively little is gained by increasing $n$, for in order to change
the precision by one decimal point $n$ must be multiplied by 100. This is in
accord with common sense which suggests that instead of making 100
measurements it is more economical and reasonable to seek an improvement
in the experimental method. A graph of $r$ versus $n$ is shown in Fig. 3. It
will be seen from that curve that it is seldom worthwhile to make more than
10 measurements of a given quantity by the same method.
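
A minimal Python sketch of eqs. (83a), (85a) and (89a), together with the corresponding measures for the mean, is given below. The function name `precision_measures` and the five readings are hypothetical and serve only to show the arithmetic.

```python
import math


def precision_measures(data):
    """Precision measures of a set of equally reliable readings, from residuals."""
    n = len(data)
    mean = sum(data) / n
    d = [x - mean for x in data]                               # residuals (13-71)
    a = sum(abs(v) for v in d) / math.sqrt(n * (n - 1))        # (13-83a)
    m = math.sqrt(sum(v * v for v in d) / (n - 1))             # (13-85a)
    r = 0.6745 * m                                             # (13-89a)
    root_n = math.sqrt(n)
    return {"mean": mean, "a": a, "m": m, "r": r,
            "A": a / root_n, "M": m / root_n, "R": r / root_n}


readings = [10.1, 10.3, 9.9, 10.0, 10.2]      # hypothetical data
result = precision_measures(readings)
print({key: round(value, 4) for key, value in result.items()})
```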



13.34. Experiments of Unequal Weight. It often happens that the
results of one experimenter are more reliable than those of another. This
may be due to superior method or apparatus, to greater experience with the
operations involved or to other reasons. Moreover, because of particularly
favorable conditions, the same investigator may obtain better results
at some times than at others. In all such cases, more weight is attached to
some of the data than to the remainder of them. For example, if one result
$X_1$ has a weight twice that of $X_2$, then the average $\bar{X} = (2X_1 + X_2)/3$.
A result of weight $w$ is thus equivalent to $w$ results of unit weight, or we say
that a result of large weight has a high precision index.

If the $j$-th measurement is of weight $w_j$, the weighted average or most
probable value is

$$\bar{X} = \frac{\sum w_jX_j}{\sum w_j}$$

The probable error for the value of weight $w_j$ is

$$p_{w_j} = \pm 0.6745 \sqrt{\frac{\sum w_id_i^2}{(n - 1)\,w_j}}$$

and for the weighted average

$$p_w = \pm 0.6745 \sqrt{\frac{\sum w_id_i^2}{(n - 1) \sum w_i}}$$

It is possible to determine the relative weights to be attached to the
individual measurements since the weight $w_j$ is inversely proportional to
$p_{w_j}^2$. The usual custom is to assign weights arbitrarily.
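
The weighted average and its probable errors can be sketched in Python as follows. The function name `weighted_result` and the two readings are hypothetical; the weights 2 and 1 are chosen to match the case $(2X_1 + X_2)/3$ mentioned above, and the residuals are taken about the weighted average.

```python
import math


def weighted_result(values, weights):
    """Weighted average with probable errors of each reading and of the average."""
    n = len(values)
    total_w = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / total_w
    s = sum(w * (x - mean) ** 2 for w, x in zip(weights, values))
    p_single = [0.6745 * math.sqrt(s / ((n - 1) * w)) for w in weights]
    p_mean = 0.6745 * math.sqrt(s / ((n - 1) * total_w))
    return mean, p_single, p_mean


mean, p_single, p_mean = weighted_result([10.0, 13.0], [2, 1])   # hypothetical data
print(mean, p_single, p_mean)
```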

13.35. Probable Error of a Function. In general the results of several
independently measured quantities are combined to give the final value of
the physical constant desired. Suppose $X, Y, \cdots$ have been obtained as
the average value of certain quantities with probable errors $P_X, P_Y, \cdots$.
If they are combined to give $Z$, where

$$Z = f(X, Y, \cdots)$$

then its probable error is

$$P = \sqrt{\left(P_X \frac{\partial Z}{\partial X}\right)^2 + \left(P_Y \frac{\partial Z}{\partial Y}\right)^2 + \cdots}$$

We record a few special cases for convenience of reference.

1. $Z = X \pm Y$; $\quad P = \pm\sqrt{P_X^2 + P_Y^2}$

2. $Z = XY$; $\quad P = \pm\sqrt{(XP_Y)^2 + (YP_X)^2}$

3. $Z = X/Y$; $\quad P = \pm\dfrac{1}{Y^2}\sqrt{Y^2P_X^2 + X^2P_Y^2}$

4. $Z = a + bX$. Suppose we know the value $Z_1$ with its probable
error $p_1$ at the point $X = X_1$ and $Z_2$ with error $p_2$ at $X = X_2$. We wish to
fit the two points to a linear equation. Then

$$P_a = \pm\frac{\sqrt{p_1^2X_2^2 + p_2^2X_1^2}}{X_2 - X_1}$$

$$P_b = \pm\frac{\sqrt{p_1^2 + p_2^2}}{X_2 - X_1}$$

$$P_Z = \pm\sqrt{\left(\frac{p_1(X_2 - X)}{X_2 - X_1}\right)^2 + \left(\frac{p_2(X_1 - X)}{X_1 - X_2}\right)^2}$$

where $P_a$, $P_b$ and $P_Z$ are the probable errors in $a$, $b$ and $Z$, respectively.
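
The general formula can be applied numerically when the partial derivatives are awkward to write out. The Python sketch below estimates them by central differences; the function name `probable_error_of`, the step size and the sample values are assumptions of this illustration, which is then compared with special cases 1 to 3.

```python
import math


def probable_error_of(f, values, errors, rel_step=1e-6):
    """Probable error of Z = f(X, Y, ...) from the probable errors of X, Y, ..."""
    total = 0.0
    for k, (v, p) in enumerate(zip(values, errors)):
        step = rel_step * (abs(v) or 1.0)
        up = list(values)
        up[k] = v + step
        down = list(values)
        down[k] = v - step
        dz = (f(*up) - f(*down)) / (2 * step)     # central-difference estimate of dZ/dX_k
        total += (p * dz) ** 2
    return math.sqrt(total)


X, Y, PX, PY = 3.0, 4.0, 0.1, 0.2                 # hypothetical values
p_sum = probable_error_of(lambda x, y: x + y, [X, Y], [PX, PY])
p_prod = probable_error_of(lambda x, y: x * y, [X, Y], [PX, PY])
p_quot = probable_error_of(lambda x, y: x / y, [X, Y], [PX, PY])
print(p_sum, p_prod, p_quot)
```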
13.36. Rejection of Observations. Occasionally a single measurement
from a set differs so widely from the others that the experimenter is tempted
to discard it. A simple rule in such cases, based on statistical methods, is
the following: Calculate the average of all the data including the suspected
measurement. Find each residual and calculate the probable error of a
single determination. If any residual exceeds five times the probable error
it may be rejected, the supposition being that the error cannot be a random
one. The reason for the use of this rule is as follows. Suppose the
probability of an error as large as $x$ in the quantity measured is 0.001, then the
chance that so large an error will not occur is 0.999. Let us
then determine the value of $hx$ for which $\operatorname{erf}(hx) = 0.999$. From tables of
this integral we find

$$hx = 2.326$$

Now from eq. (88) we have

$$hr = 0.4769$$

thus

$$x = 4.9\,r$$

We conclude that the probability of an error 5 times as great as the probable
error of a single measurement is less than 1 in 1000, hence the somewhat
dogmatic rule for rejecting such measurements.
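
The rule is readily mechanized. In the Python sketch below the function name `reject_outliers` and the twenty readings are hypothetical; the probable error of a single determination is taken from the residuals as $0.6745\sqrt{\sum d_i^2/(n - 1)}$.

```python
import math


def reject_outliers(data, factor=5.0):
    """Readings whose residual exceeds factor times the probable error of one reading."""
    n = len(data)
    mean = sum(data) / n                       # average including any suspected reading
    d = [x - mean for x in data]
    r = 0.6745 * math.sqrt(sum(v * v for v in d) / (n - 1))
    return [x for x, v in zip(data, d) if abs(v) > factor * r]


# Hypothetical set: nineteen readings near 10 and one suspected value.
data = [10.0] * 5 + [10.1] * 4 + [9.9] * 4 + [10.2] * 3 + [9.8] * 3 + [12.0]
rejected = reject_outliers(data)
print(rejected)
```

Since the suspected reading itself enlarges the probable error, the test is a severe one for short series: with only a handful of readings no residual can reach five times the probable error computed in this way.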

See footnote, p. 489.

13.37. Empirical Formulas. As mentioned in sec. 13.1, there is
considerable advantage in representing experimental data by means of
equations, the correct form of them being often suggested by theoretical
considerations. In other cases, plots of various functions of the data may
indicate a suitable form. When this question is settled, the next step is to
determine the constants in the equation. Sometimes a graph may be used
for this purpose, for if the equation is linear it is only necessary to
determine the slope and intercept of the curve. In more exact work, numerical
methods are needed.

a. The Method of Averages. Suppose that the quantity $y$ has been
observed as a function of another quantity $x$, the resulting numbers being
$y_1, y_2, \cdots, y_n$. It has been decided that a polynomial of the $m$-th degree,
$m < n$, is a suitable equation

$$y = A + Bx + Cx^2 + \cdots \tag{13-90}$$

Divide the measurements into groups equal in number to the unknown
constants, placing an equal number of results in each group if this is
possible. Add the equations in each group thus obtaining a set of
simultaneous equations equal in number to the number of unknowns. The
equations may be solved by the methods of sec. 13.26.

It will be found in general that this procedure is quite satisfactory.
The resulting constants are different for different groupings of the data, but
the simplest such grouping is usually better than any other. If there are a
large number of results or if the polynomial is of degree higher than four or
five this method is nearly as good as the method of least squares and entails
considerably less calculation.

b. The Method of Least Squares. Suppose as before that n values are 
available for y but that the chosen equation is of a more general form than 
(90), 

$y = f(x, A, B, C, \cdots)$    (13-91)

If there are n constants we may obviously fit the data exactly to such an 
equation but usually there will only be m < n constants. Thus the calculated 
value of y will not agree with the observed one. Let 

$y_i = y_i(\mathrm{calc.}) + d_i$

where $y_i$ is an observed y and $y_i$(calc.) is the corresponding calculated one 
using the constants finally adopted. In accordance with the principle of 
least squares we wish to make 

$\sum_{i=1}^{n} d_i^2$ = a minimum    (13-92)

Let us now assume that we have found approximate values of the con- 
stants by graphical means or otherwise so that 

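$A = A_0 + a; \quad B = B_0 + b; \quad C = C_0 + c; \cdots$

a, b, c, $\cdots$ being small correction terms. Then the i-th equation of (91) 

$f_i(A, B, C, \cdots) = y_i - d_i$

may be written as 

$f_i(A_0, B_0, C_0, \cdots) + a\,\frac{\partial f_i}{\partial A_0} + b\,\frac{\partial f_i}{\partial B_0} + c\,\frac{\partial f_i}{\partial C_0} + \cdots = y_i - d_i$    (13-93)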


where we have discarded derivatives of second and higher order. Using 
the abbreviations 


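$\frac{\partial f_i}{\partial A_0} = u_i; \quad \frac{\partial f_i}{\partial B_0} = v_i; \quad \frac{\partial f_i}{\partial C_0} = w_i; \cdots$

and 

$y_i - f_i(A_0, B_0, C_0, \cdots) = F_i$

(93) becomes 

$u_i a + v_i b + w_i c + \cdots - F_i + d_i = 0$

where $u_i$, $v_i$, $w_i$, $F_i$ are known and a, b, c, $d_i$ are unknown. Since we wish 
(92) to hold we must require that 

$\sum_{i=1}^{n} (u_i a + v_i b + w_i c + \cdots - F_i)^2 = \phi(a, b, c)$

be a minimum or that 

$\frac{\partial \phi}{\partial a} = 2\sum (u_i a + v_i b + w_i c + \cdots - F_i)\,u_i = 0$

$\frac{\partial \phi}{\partial b} = 2\sum (u_i a + v_i b + w_i c + \cdots - F_i)\,v_i = 0$    (13-94)

$\frac{\partial \phi}{\partial c} = 2\sum (u_i a + v_i b + w_i c + \cdots - F_i)\,w_i = 0$
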


These equations (when divided by two) are called the normal equations. 
There will be as many of them as there are unknowns. 

In many cases, the chosen relation between x and y is a polynomial, 
when some simplification in the procedure is possible. The original equations 
corresponding to (91) will be of the form 

$A + Bx_i + Cx_i^2 + \cdots = y_i$    (13-95)

It is still worthwhile to use approximate values of the constants for then the 
normal equations will be easier to handle. If this is done (95) becomes 

$a + bx_i + cx_i^2 + \cdots = F_i$    (13-95a)

In either case, the normal equations may be written down without differen- 
tiation. They are found as follows: (1) multiply each equation of (95) 
or (95a) by the coefficient of the first unknown (unity since we are speaking 
of A or a) and add the resulting n equations; (2) multiply each equation 
by the coefficient of the next unknown ($x_i$) and add these equations; (3) 
continue in the same way until each equation has been multiplied by the 
coefficient of each unknown. The resulting normal equations which are 
identical with those obtained by the procedure leading to eq. (94) may then 
be solved by the methods of sec. 13.26 to obtain the constants. The final 
equation should always be checked by using it to compute each known $y_i$. 
The sum of the squares of the residuals should be small and the algebraic 
sum of the residuals themselves should be nearly zero.** 
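
The three-step recipe for writing down the normal equations of a polynomial can be illustrated as follows. This is a sketch only: the function names and the four observations are invented for the example, and exact rational arithmetic is used so that the residual check comes out exactly.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def normal_equations(xs, ys, degree):
    """Normal equations for A + B x_i + C x_i^2 + ... = y_i."""
    k = degree + 1
    # Row p: each observation equation multiplied by x_i**p, the coefficient
    # of the p-th unknown, and the n products added together.
    matrix = [[sum(x ** (p + q) for x in xs) for q in range(k)] for p in range(k)]
    rhs = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(k)]
    return matrix, rhs

xs = [Fraction(v) for v in (0, 1, 2, 3)]
ys = [Fraction(v) for v in (1, 3, 4, 8)]     # made-up observations
matrix, rhs = normal_equations(xs, ys, 1)
a, b = solve_linear(matrix, rhs)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
```

For these data the normal equations are 4a + 6b = 16 and 6a + 14b = 35, and the residuals add up to zero, which is the check recommended above.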

Such a procedure will show how closely the curve fits the known points 
but says nothing about the reliability of the curve at other places. In the 
important case of a linear equation, y = a + bx, the formulas† are 
comparatively simple. The probable errors in a and b are 

$P_a = r_e \sqrt{\frac{\sum x_i^2}{D}}; \quad P_b = r_e \sqrt{\frac{n}{D}}$

$r_e = 0.6745 \sqrt{\frac{\sum d_i^2}{n-2}}; \quad D = n\sum x_i^2 - (\sum x_i)^2$

The error in y at any point x (x not necessarily a measured value) is 

$P_y = r_e \sqrt{\frac{\sum (x_i - x)^2}{D}}$

** Further details of the method of least squares are given by Brunt, D., "The 
Combination of Observations," Cambridge Press, 1917. He describes several schemes 
for checking the calculations and evaluating the constants with their probable errors. 

† See Birge, loc. cit. 
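
A small numerical sketch of the probable-error formulas for the straight line y = a + bx follows. It assumes the usual form $r_e = 0.6745\sqrt{\sum d_i^2/(n-2)}$ for the probable error of a single observation (the divisor n - 2 is our assumption), and the data and function name are invented for illustration.

```python
import math

def linear_fit_probable_errors(xs, ys):
    """Least-squares line y = a + b x with probable errors of a, b and of y at any x."""
    n = len(xs)
    sx, sxx = sum(xs), sum(x * x for x in xs)
    sy, sxy = sum(ys), sum(x * y for x, y in zip(xs, ys))
    d = n * sxx - sx * sx                        # D = n*sum(x^2) - (sum x)^2
    a = (sxx * sy - sx * sxy) / d
    b = (n * sxy - sx * sy) / d
    sum_d2 = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    r_e = 0.6745 * math.sqrt(sum_d2 / (n - 2))   # assumed divisor n - 2
    p_a = r_e * math.sqrt(sxx / d)
    p_b = r_e * math.sqrt(n / d)

    def p_y(x):
        # Probable error of the fitted y at an arbitrary abscissa x.
        return r_e * math.sqrt(sum((xi - x) ** 2 for xi in xs) / d)

    return a, b, p_a, p_b, p_y

a, b, p_a, p_b, p_y = linear_fit_probable_errors([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 4.0, 8.0])
```

A convenient consistency check is that the error in y at x = 0 coincides with the probable error of the intercept a.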



CHAPTER 14 

LINEAR INTEGRAL EQUATIONS 


14.1. Definitions and Terminology. — An integral equation is one which 
contains the unknown function behind the integral sign. Its importance 
for physical problems lies in the fact that most differential equations together 
with their boundary conditions may be reformulated to give a single integral 
equation. If the latter can be solved, the mathematical difficulties are 
not appreciably greater even when the number of independent variables 
is increased, while differential equations, such as Laplace's, are considerably 
more complex in three dimensions than in two. The theory of integral 
equations also furnishes a uniform method for the study of the eigenvalue 
problems of mathematical physics. 

A linear integral equation of the third kind,^1 the most general type 
considered, has the form 


$g(x)\,\phi(x) = f(x) + \lambda \int_a^b K(x,z)\,\phi(z)\,dz$    (14-1)


The known functions are g(x), f(x) and K(x,z), the latter being called the 
kernel or nucleus. The limits of integration a and b are either known functions 
of x or constants; $\lambda$ is an absolute constant or a parameter. It is 
desired to find the unknown $\phi$ as a function of the independent variable x. 

Four special cases of (1) have been most widely studied. In Fredholm's 
equation of the first kind, g(x) = 0, and in his equation of the second kind, 
g(x) = 1; in both cases a and b are constants. Volterra's equations of the 
first and second kind are like Fredholm's equations except that a = 0, and 
b = x. If f(x) = 0 in either case, the equation is said to be homogeneous. 
When one or both limits become infinite or when the kernel becomes infinite 
at one or more points within the range a to b, the equation is called singular. 

Non-linear integral equations may occur in the form 

$\phi(x) = f(x) + \lambda \int_a^b K(x,z)\,\phi^n(z)\,dz$

We limit our discussion here to linear equations in one variable where the 
unknown $\phi$ enters only to the first power. Our plan is to present first the 
purely formal mathematical methods of solution. We then show how to 
convert differential equations into integral equations and apply the theory 
to certain physical problems. 


GENERAL METHODS OF SOLVING INTEGRAL EQUATIONS 

14.2. The Liouville-Neumann Series. — a. Fredholm's Equation of the 
Second Kind. Suppose the given integral equation is 

$\phi(x) = f(x) + \lambda \int_a^b K(x,z)\,\phi(z)\,dz$    (14-2)

where x and z are real variables with $a \le x \le b$, $a \le z \le b$; K(x,z) and 
f(x) are continuous but may be complex. We attempt to solve (2) by 
means of a power series in $\lambda$: 

$\phi(x) = \sum_{n=0}^{\infty} \lambda^n \phi_n(x)$    (14-3)

Substituting (3) into (2) and equating coefficients of equal powers of $\lambda$ we 
obtain 

$\phi_0(x) = f(x)$

$\phi_1(x) = \int_a^b K(x,z)\,\phi_0(z)\,dz$

$\phi_2(x) = \int_a^b K(x,z)\,\phi_1(z)\,dz$    (14-4)

$\cdots$

$\phi_n(x) = \int_a^b K(x,z)\,\phi_{n-1}(z)\,dz$

Remembering that both x and z are restricted to lie between a and b, we see 
that the kernel and f(x) must have maximum values, for we assumed them 
to be continuous. Let these maxima be given by $|K(x,z)| \le M$, $|f(x)| \le N$. 
Then it follows that 

$|\phi_0| \le N, \quad |\phi_1| \le NM(b-a), \quad \cdots; \quad |\phi_n| \le N[M(b-a)]^n$

^1 More complete treatments of integral equations are given by Lovitt, "Integral 
Equations," McGraw-Hill Book Co., New York; Vivanti-Schwank, "Lineare 
Integralgleichungen," Helwingsche Verlagsbuchhandlung, Hannover, 1929. Chapters on the 
subject may be found in Whittaker, E. T., and Watson, G. N., "Modern Analysis," 
Fourth Edition, Cambridge University Press, 1927; Courant, R., and Hilbert, D., 
"Methoden der Mathematischen Physik," Vol. 1, J. Springer, Berlin, 1931. 

If 

$|\lambda| < \frac{1}{M(b-a)}$    (14-5)

the series (3), which is called the Liouville-Neumann series, converges 
uniformly and is the unique continuous^2 solution of (2) within the range 
$a \le x \le b$. 
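
The successive terms of the series can be generated mechanically when the kernel is a polynomial. The sketch below is an illustration of ours, not a general solver: it takes K(x,z) = x + z, f(x) = x, a = 0, b = 1, so that every $\phi_n$ is a linear function and can be stored as a pair of exact coefficients. Here M = 2 and b - a = 1, so condition (5) is met by $\lambda = 1/4$. The closed form used for comparison is obtained by solving this particular equation directly.

```python
from fractions import Fraction

def apply_kernel(coeffs):
    """Map phi(z) = c0 + c1 z to the integral of (x + z) phi(z) over 0..1, again as (c0, c1)."""
    c0, c1 = coeffs
    # integral = x (c0 + c1/2) + (c0/2 + c1/3)
    return (c0 / 2 + c1 / 3, c0 + c1 / 2)

def neumann_partial_sum(lam, terms):
    """Sum phi_0 + lam phi_1 + ... + lam**terms phi_terms for f(x) = x."""
    phi_n = (Fraction(0), Fraction(1))          # phi_0(x) = f(x) = x
    total = phi_n
    power = Fraction(1)
    for _ in range(terms):
        phi_n = apply_kernel(phi_n)
        power *= lam
        total = (total[0] + power * phi_n[0], total[1] + power * phi_n[1])
    return total

def closed_form(lam):
    """Direct solution phi(x) = [6x(lam - 2) - 4 lam] / (lam^2 + 12 lam - 12), as (c0, c1)."""
    den = lam * lam + 12 * lam - 12
    return (-4 * lam / den, 6 * (lam - 2) / den)

approx = neumann_partial_sum(Fraction(1, 4), 40)
exact = closed_form(Fraction(1, 4))
```

Forty terms already agree with the closed form to better than one part in $10^{12}$ for this value of $\lambda$.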

In order to obtain the solution in more convenient form, we define the 
iterated kernels^3 

$K_1(x,z) = K(x,z)$

$K_2(x,z) = \int K(x,y)\,K(y,z)\,dy$

$\cdots$    (14-6)

$K_n(x,z) = \int K(x,y)\,K_{n-1}(y,z)\,dy = \int\!\!\int \cdots \int K(x,y_1)\,K(y_1,y_2) \cdots K(y_{n-1},z)\,dy_1\,dy_2 \cdots dy_{n-1}$


Introducing these functions into (4) we may write 

$\phi_1(x) = \int K(x,z)\,f(z)\,dz$

$\phi_2(x) = \int K_2(x,z)\,f(z)\,dz$    (14-7)

$\phi_n(x) = \int K_n(x,z)\,f(z)\,dz$

By the same means as before we see that $|K_n(x,z)| \le M^n (b-a)^{n-1}$; 
hence if (5) is fulfilled we can construct a uniformly convergent series 
called the resolvent (lösender Kern): 

$K(x,z;\lambda) = \sum_{n=0}^{\infty} \lambda^n K_{n+1}(x,z)$    (14-8)

From (3), (6) and (8), it follows that the solution of the integral equation 
is 

$\phi(x) = f(x) + \lambda \int K(x,z;\lambda)\,f(z)\,dz$    (14-9)

^2 Continuous solutions of the equation may exist even if (5) is not true. There may 
also be discontinuous solutions. For these exceptions, see Lovitt, loc. cit., pp. 13 and 21. 

^3 Henceforth, we usually omit limits of integration unless they are different from 
a and b. 

The resolvent and the kernel have properties of a reciprocal nature as may 
be seen by comparing (2) and (9). If $\phi(x)$ is the unknown, (9) is the 
solution; if f(x) in (9) is the unknown, (2) is the solution. These properties 
are even more apparent if we rewrite (8) in the form 

$K(x,z;\lambda) - K(x,z) = \sum_{n=1}^{\infty} \lambda^n K_{n+1}(x,z) = \lambda \sum_{n=0}^{\infty} \lambda^n \int K(x,y)\,K_{n+1}(y,z)\,dy$

or 

$K(x,z;\lambda) - K(x,z) = \lambda \int K(x,y)\,K(y,z;\lambda)\,dy$    (14-10)

Similarly, we may obtain 

$K(x,z;\lambda) - K(x,z) = \lambda \int K(x,y;\lambda)\,K(y,z)\,dy$
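
As a worked illustration (the kernel is chosen by us for convenience and is not taken from the text), let K(x,z) = xz with a = 0, b = 1. The iterated kernels and the resolvent can then be summed in closed form and the relation (10) checked directly:

```latex
\begin{aligned}
K_1(x,z) &= xz, \qquad
K_2(x,z) = \int_0^1 (xy)(yz)\,dy = \frac{xz}{3}, \qquad
K_n(x,z) = \frac{xz}{3^{\,n-1}} \\
K(x,z;\lambda) &= \sum_{n=0}^{\infty} \lambda^n K_{n+1}(x,z)
   = xz \sum_{n=0}^{\infty} \left(\frac{\lambda}{3}\right)^{n}
   = \frac{3xz}{3-\lambda}, \qquad |\lambda| < 3 \\
\lambda \int_0^1 K(x,y)\,K(y,z;\lambda)\,dy
   &= \lambda \int_0^1 xy\,\frac{3yz}{3-\lambda}\,dy
   = \frac{\lambda\,xz}{3-\lambda}
   = K(x,z;\lambda) - K(x,z)
\end{aligned}
```

For this kernel M = 1, so that (5) guarantees convergence only for $|\lambda| < 1$, whereas the series actually converges for $|\lambda| < 3$; condition (5) is sufficient but not necessary.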

b. Volterra's Equation of the Second Kind. Application of the Liouville-Neumann 
series may also be made in this case. Suppose 

$\phi(x) = f(x) + \lambda \int_0^x H(x,z)\,\phi(z)\,dz$    (14-11)

is given. Then if 

$K(x,z) = H(x,z), \; z \le x; \qquad K(x,z) = 0, \; z > x$

we may write an equation similar to (7) for z < x 

$\phi_n(x) = \int_0^x K_n(x,z)\,f(z)\,dz$

and also an equation like (6) 

$K_n(x,z) = \int_z^x K(x,y)\,K_{n-1}(y,z)\,dy = \int_z^x K(x,y_1)\,dy_1 \int_z^{y_1} K(y_1,y_2)\,K_{n-2}(y_2,z)\,dy_2$

The solution of Volterra's equation obtained in this way converges for all 
values of $\lambda$. 
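
The convergence can be watched in a simple case. As an illustration of ours, take $\lambda = 1$, f(x) = x and the kernel z - x, so that the equation is $\phi(x) = x + \int_0^x (z - x)\,\phi(z)\,dz$. Every term of the series is then a polynomial, which the sketch below stores as a list of exact coefficients; the function names are hypothetical.

```python
from fractions import Fraction
import math

def next_term(coeffs):
    """Given phi_{n-1}(z) = sum c_k z^k, return phi_n(x) = integral over 0..x of (z - x) phi_{n-1}(z)."""
    out = [Fraction(0)] * (len(coeffs) + 2)
    for k, c in enumerate(coeffs):
        # integral over 0..x of (z - x) z^k dz = -x^(k+2) / ((k+1)(k+2))
        out[k + 2] -= c / ((k + 1) * (k + 2))
    return out

def neumann_series(terms):
    """Coefficients of phi_0 + phi_1 + ... + phi_terms, starting from phi_0(x) = x."""
    phi_n = [Fraction(0), Fraction(1)]
    total = list(phi_n)
    for _ in range(terms):
        phi_n = next_term(phi_n)
        total += [Fraction(0)] * (len(phi_n) - len(total))
        total = [t + p for t, p in zip(total, phi_n)]
    return total

def evaluate(coeffs, x):
    return sum(float(c) * x ** k for k, c in enumerate(coeffs))

series = neumann_series(8)
```

The coefficients produced are those of the power series of sin x, namely 1, -1/3!, 1/5!, and so on, and the factorial denominators show why no restriction on $\lambda$ is needed for a Volterra equation.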

c. Volterra's Equation of the First Kind. Under certain conditions, 
Volterra's equation of the first kind may also be solved by the Liouville-Neumann 
series. With a change of notation, we write this equation as 

$g(x) = \lambda \int_0^x K(x,z)\,\phi(z)\,dz$    (14-12)

Differentiation with respect to x results in 

$g'(x) = \lambda \int_0^x \frac{\partial K}{\partial x}\,\phi(z)\,dz + \lambda K(x,x)\,\phi(x)$

which is similar to (11) provided $K(x,x) \ne 0$ and 

$f(x) = \frac{g'(x)}{\lambda K(x,x)}; \qquad H(x,z) = -\frac{\partial K/\partial x}{\lambda K(x,x)}$

A similar conversion of (12) to an equation of the second kind may be made 
by partial integration. Special methods must be used when K(x,x) = 0. 

An important case arises when the kernel becomes infinite at one or more 
points within the range of x and z. It is then necessary to transform the 
equation to remove the singularity. As a typical example, consider the 
kernel 

$K(x,z) = \frac{1}{(x-z)^{\alpha}}; \quad 0 < \alpha < 1$

which is infinite when x = z. Substitute this kernel in (12), multiply both 
sides of the equation by $dx/(u-x)^{1-\alpha}$ and integrate with respect to x 
from 0 to u. If for simplicity we also take $\lambda = 1$ the result is 

$\int_0^u \frac{g(x)\,dx}{(u-x)^{1-\alpha}} = \int_0^u \frac{dx}{(u-x)^{1-\alpha}} \int_0^x \frac{\phi(z)\,dz}{(x-z)^{\alpha}} = \int_0^u \phi(z)\,dz \int_z^u \frac{dx}{(u-x)^{1-\alpha}(x-z)^{\alpha}}$

The justification of the change of limits and order of integration in the last 
equation is the following. Since x varies from 0 to u, and for every value of 
x, the variable z goes from 0 to x, the situation is equivalent to the variation 
of z from 0 to u and the variation of x from z to u for every value of z. 
The same result is also easily obtained from a figure. If we are integrating 
F(x,z) over the shaded area of Fig. 1 we see that 

$\int_0^u dx \int_0^x F(x,z)\,dz = \int_0^u dz \int_z^u F(x,z)\,dx$

The integration^4 over x results in $\pi/\sin \alpha\pi$, hence the solution of the 
equation is 

$\phi(u) = \frac{\sin \alpha\pi}{\pi}\,\frac{d}{du}\left[\int_0^u \frac{g(x)\,dx}{(u-x)^{1-\alpha}}\right]$
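
A short check of this inversion formula may be helpful. Take, as an example of our own choosing, $\alpha = 1/2$ and $\phi(z) = 1$, compute the corresponding g(x) from (12) with $\lambda = 1$, and recover $\phi$ from the formula:

```latex
\begin{aligned}
g(x) &= \int_0^x \frac{dz}{(x-z)^{1/2}} = 2\sqrt{x} \\
\int_0^u \frac{g(x)\,dx}{(u-x)^{1/2}}
   &= 2\int_0^u \frac{\sqrt{x}\,dx}{\sqrt{u-x}}
   = 2u\,B\!\left(\tfrac{3}{2},\tfrac{1}{2}\right) = \pi u \\
\phi(u) &= \frac{\sin(\pi/2)}{\pi}\,\frac{d}{du}\,(\pi u) = 1
\end{aligned}
```

Here B denotes the beta function, with $B(3/2, 1/2) = \pi/2$; the original $\phi$ is recovered as expected.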


^4 See sec. 3.2. 

Problem. Solve the equation 

$\phi(x) = x + \int_0^x (z-x)\,\phi(z)\,dz$

by the Liouville-Neumann series. Hint: substitute (z - x) = u; (y - x) = v. 
Ans.: $\phi(x) = \sin x$. 



14.3. Fredholm's Method of Solution. — a. The Inhomogeneous Equation. 
Fredholm studied the solution of a system of linear equations in 
n variables and observed that as n becomes infinite the results are applicable 
to linear integral equations. Although the reasoning is simple, the 
derivation of the final formulas requires considerable space. We therefore 
show only how the method may be used, referring the reader to other 
sources^5 for the intermediate steps and proofs. 

The unique and continuous solution of (2) is of the form (9), where the 
resolvent is the ratio of two infinite series in $\lambda$. In fact 

$K(x,z;\lambda) = \frac{D(x,z;\lambda)}{D(\lambda)}$    (14-13)

where 

$D(x,z;\lambda) = K(x,z) + \sum_{n=1}^{\infty} \frac{(-1)^n}{n!}\,D_n(x,z)\,\lambda^n$    (14-14)

$D(\lambda) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\,D_n\,\lambda^n$    (14-15)

The coefficients $D_n$ and the functions $D_n(x,z)$ may be found from the 
following recurrence relations. Starting with $D_0 = 1$ and $K(x,z) = D_0(x,z)$ we obtain 
$D_1$ from the integral 

$D_m = \int D_{m-1}(x,x)\,dx$    (14-16)

We then find $D_1(x,z)$ from 

$D_m(x,z) = K(x,z)\,D_m - m \int K(x,y)\,D_{m-1}(y,z)\,dy$    (14-17)

which enables us to determine $D_2$ from (16). Continuing in this way, all 
of the coefficients are calculated. In many cases, depending on the explicit 
form of the kernel, the series (14) and (15) contain only a finite number of 
terms. 

^5 See, for example, Kowalewski, G., "Einführung in die Determinantentheorie, 
einschliesslich der unendlichen und der Fredholmschen Determinanten," Leipzig, 1909, 
or Whittaker and Watson, loc. cit. 

One distinct advantage of the Fredholm method is that (13) is uniformly 
convergent for all values of $\lambda$ unless $D(\lambda) = 0$. If that happens, the 
procedure which we have described is inapplicable since the resolvent 
becomes infinite. Actually, there is then no solution unless certain other 
conditions are met. We omit the necessary extension of the Fredholm theory 
but return to the problem in sec. 14.4b. 
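
The recurrence relations (16) and (17) lend themselves to mechanical evaluation when the kernel is a polynomial. The following sketch is ours: polynomials in x and z are stored as dictionaries mapping exponent pairs to exact coefficients, the range is taken as 0 to 1, and the kernel K(x,z) = x + z is used as the example. All function names are hypothetical.

```python
from fractions import Fraction

# A polynomial in (x, z) is a dict {(i, j): c}, meaning the sum of c * x**i * z**j.

def add(p, q, scale=1):
    """Return p + scale * q, dropping terms whose coefficient is zero."""
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + scale * c
    return {key: c for key, c in out.items() if c != 0}

def compose(kern, d):
    """Integral over y from 0 to 1 of kern(x, y) * d(y, z)."""
    out = {}
    for (a, b), kc in kern.items():
        for (c, e), dc in d.items():
            # x^a y^b times y^c z^e integrates to x^a z^e / (b + c + 1)
            out[(a, e)] = out.get((a, e), 0) + kc * dc / (b + c + 1)
    return {key: c for key, c in out.items() if c != 0}

def trace(d):
    """Integral over x from 0 to 1 of d(x, x)."""
    return sum(c / (i + j + 1) for (i, j), c in d.items())

def fredholm_coefficients(kernel, order):
    numbers = [Fraction(1)]        # D_0 = 1
    functions = [kernel]           # D_0(x, z) = K(x, z)
    for m in range(1, order + 1):
        d_m = trace(functions[-1])                                  # eq. (16)
        scaled = {key: d_m * c for key, c in kernel.items()}
        functions.append(add(scaled, compose(kernel, functions[-1]), scale=-m))  # eq. (17)
        numbers.append(d_m)
    return numbers, functions

kernel = {(1, 0): Fraction(1), (0, 1): Fraction(1)}    # K(x, z) = x + z
numbers, functions = fredholm_coefficients(kernel, 3)
```

For this kernel the procedure terminates: $D_1 = 1$, $D_2 = -1/6$, $D_2(x,z)$ vanishes identically, and hence $D(\lambda) = 1 - \lambda - \lambda^2/12$ and $D(x,z;\lambda) = (x+z) - \lambda[(x+z)/2 - xz - 1/3]$.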

b. The Homogeneous Equation. If f(x) = 0, the given equation 
is homogeneous, 

$\phi(x) = \lambda \int K(x,z)\,\phi(z)\,dz$    (14-18)

Then cursory inspection of the solution (9) leads to the conclusion that 
$\phi(x) = 0$. This is generally the case but we shall see that when the 
parameter $\lambda$ assumes certain special values we are led to a situation similar 
to the eigenvalue and eigenfunction problem described in Chapter 8. If 
$D(\lambda) = 0$ and $D(x,z;\lambda) \ne 0$, eq. (13) indicates that $K(x,z;\lambda)$ approaches 
infinity and we may still find non-vanishing solutions of (18). Equating 
the right side of (15) to zero, we have a polynomial in $\lambda$ with n roots, 
multiple or distinct. They are the eigenvalues of the kernel, and the corresponding 
solutions of (18) are the eigenfunctions. Assuming that all eigenvalues 
are distinct, choose one of them, say $\lambda_i$, substitute (13) in (10) and multiply 
by $D(\lambda_i)$, which gives 

$D(x,z;\lambda_i) = \lambda_i \int K(x,y)\,D(y,z;\lambda_i)\,dy$    (14-19)

If we compare this equation with (18), we observe that $D(x,z;\lambda_i)$, for any 
constant z = c, is a solution of the homogeneous equation, i.e., 

$\phi_i(x) = D(x,c;\lambda_i)$    (14-20)

Having found a solution for $\lambda_i$, we proceed to find the others for the 
remaining eigenvalues in the same way. Linear combinations of them form 
the general solution 

$\phi(x) = \sum_{m=1}^{n} c_m\,\phi_m(x)$    (14-21)

where the $c_m$ are arbitrary constants. 

It is true that $D(x,z;\lambda_i)$ may vanish identically in x and z or vanish 
because of an unfortunate choice of the constant value of z. In the former 
case, non-trivial solutions may often be found by more complicated methods; 
in the latter case, we simply choose another z = c. When the eigenvalues 
are degenerate further modifications of the method are required. 
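
A numerical check of the eigenvalue condition may be sketched for the kernel K(x,z) = x + z on the range 0 to 1, an example of our choosing. For this kernel the recurrence relations (16) and (17) give $D(\lambda) = 1 - \lambda - \lambda^2/12$, whose roots are $\lambda = -6 \pm 4\sqrt{3}$; the trial eigenfunctions $1 + tx$ with $t = \pm\sqrt{3}$ are then tested in (18). The function names are hypothetical.

```python
import math

def fredholm_determinant(lam):
    """D(lambda) = 1 - lambda - lambda**2 / 12 for K(x, z) = x + z on 0..1."""
    return 1 - lam - lam * lam / 12

def right_side(lam, t, x):
    """lambda times the integral over 0..1 of (x + z)(1 + t z) dz, in closed form."""
    return lam * (x * (1 + t / 2) + 0.5 + t / 3)

root3 = math.sqrt(3)
eigenpairs = [(-6 + 4 * root3, root3), (-6 - 4 * root3, -root3)]

# Largest deviation between the two sides of (18) at a few sample points.
worst = max(
    abs(right_side(lam, t, x) - (1 + t * x))
    for lam, t in eigenpairs
    for x in (0.0, 0.3, 1.0)
)
```

Both roots make $D(\lambda)$ vanish, and the corresponding functions $1 \pm \sqrt{3}\,x$ reproduce themselves under the integral to within rounding error.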


Problem a. Solve by the Fredholm method: 

$\phi(x) = x + \lambda \int_0^1 (x+z)\,\phi(z)\,dz$

Ans.: $\phi(x) = \frac{6x(\lambda - 2) - 4\lambda}{\lambda^2 + 12\lambda - 12}$

Problem b. Show that 

$\int D(x,x;\lambda)\,dx = -\frac{dD(\lambda)}{d\lambda}$

Hint: use eq. (16). 

Problem c. Set f(x) = 0 in the equation of Problem a and solve. Hint: show 
that $D(x,c;\lambda) = (2/t)(2-t)(tx+1)(tc+1)$; $\lambda = 2t(2-t)$; $t = \pm\sqrt{3}$. 
Ans.: $\phi_{\pm}(x) = C_{\pm}(1 \pm \sqrt{3}\,x)$. 


14.4. The Schmidt-Hilbert Method of Solution. — In many physical 
problems, the kernel has the property of being symmetric, i.e., K(x,z) 
= K(z,x). In such cases,^6 the integral equation may be solved by a method 
which is somewhat different from any of those in the preceding sections. 
We find it convenient to limit the discussion to kernels which are real as 
well as symmetric. 

a. The Homogeneous Equation. A real symmetric kernel has at least one 
eigenvalue and it may have an infinite number. We omit the proof of 
these facts. 

The eigenfunctions of the homogeneous equation (18) are mutually orthogonal. 
Suppose $\lambda_i$ and $\lambda_j$ are two different eigenvalues corresponding 
respectively to eigenfunctions $\phi_i$ and $\phi_j$. Then we may write 

$\phi_i(x) = \lambda_i \int K(x,z)\,\phi_i(z)\,dz$

$\phi_j(x) = \lambda_j \int K(x,z)\,\phi_j(z)\,dz$

^6 Unsymmetric kernels may often be symmetrized; see sec. 14.7 or Courant-Hilbert, 
loc. cit. 

Multiply the first equation by $\phi_j$ and the second by $\phi_i$, then integrate over x: 

$\int \phi_i(x)\,\phi_j(x)\,dx = \lambda_i \int\!\!\int K(x,z)\,\phi_i(z)\,\phi_j(x)\,dz\,dx = \lambda_j \int\!\!\int K(x,z)\,\phi_i(x)\,\phi_j(z)\,dz\,dx$    (14-22)

The last integral may be written as $\lambda_j \int\!\!\int K(z,x)\,\phi_i(z)\,\phi_j(x)\,dz\,dx$ by 
interchanging x and z. Thus if K(x,z) = K(z,x), the two integrals of (22) are 
identical and since $\lambda_i \ne \lambda_j$, it follows that 

$\int \phi_i(x)\,\phi_j(x)\,dx = 0$    (14-23)

As we know from Chapter 8 such functions may always be normalized. 
Henceforth, we will assume that this has been done and will indicate the 
orthonormal solutions of (18) by $\Phi_i(x)$, so that 

$\int \Phi_i(x)\,\Phi_j(x)\,dx = \delta_{ij}$    (14-24)

The eigenvalues of a real, symmetric kernel are all real. Suppose the 
solution of the homogeneous equation (18) were of the form $\phi(x) = \phi_1(x) + i\phi_2(x)$ 
and one of its eigenvalues were also complex, $\lambda = \alpha + i\beta$. We 
could then take the complex conjugate of (18) 

$\phi^*(x) = \lambda^* \int K(x,z)\,\phi^*(z)\,dz$

But according to (23) 

$(\lambda - \lambda^*) \int \phi(x)\,\phi^*(x)\,dx = 0$

or 

$2i\beta \int (\phi_1^2 + \phi_2^2)\,dx = 0$

which means that $\beta = 0$ and the eigenvalues must all be real. 

Arbitrary functions of x, including the kernel for fixed z, may be 
expanded in terms of the eigenfunctions 

$K(x,z) = \sum_i c_i\,\Phi_i(x)$    (14-25)

The functions $\Phi_i(x)$ form a complete set as explained in Chapter 8. As 
also shown there, the coefficients of (25) may be found by integrating that 
equation term by term. Thus, using (24), 

$c_i = \int K(x,z)\,\Phi_i(x)\,dx$

But 

$\Phi_i(z) = \lambda_i \int K(z,x)\,\Phi_i(x)\,dx = \lambda_i \int K(x,z)\,\Phi_i(x)\,dx$

since the kernel is symmetric, hence 

$c_i = \frac{\Phi_i(z)}{\lambda_i}$

and (25) becomes 

$K(x,z) = \sum_i \frac{\Phi_i(x)\,\Phi_i(z)}{\lambda_i}$    (14-26)
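
The bilinear expansion (26) can be tested numerically on a kernel whose eigenfunctions are known. As an assumed example (not taken from this section) we use the symmetric kernel equal to x(1 - z) for x < z and z(1 - x) for x > z on the range 0 to 1, for which $\Phi_n(x) = \sqrt{2}\,\sin n\pi x$ and $\lambda_n = n^2\pi^2$.

```python
import math

def kernel(x, z):
    """Symmetric example kernel: x(1 - z) for x <= z, z(1 - x) otherwise."""
    return x * (1 - z) if x <= z else z * (1 - x)

def bilinear_sum(x, z, terms):
    """Partial sum of Phi_n(x) Phi_n(z) / lambda_n with Phi_n = sqrt(2) sin(n pi x)."""
    total = 0.0
    for n in range(1, terms + 1):
        lam_n = (n * math.pi) ** 2
        total += 2 * math.sin(n * math.pi * x) * math.sin(n * math.pi * z) / lam_n
    return total

approx = bilinear_sum(0.3, 0.7, 5000)
exact = kernel(0.3, 0.7)
```

The partial sums approach the kernel slowly, the terms decreasing only as $1/n^2$, but the agreement at a few thousand terms is already good to about four decimal places.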


b. Solution of the Inhomogeneous Equation. We are now ready to consider 
the inhomogeneous equation (2); for that purpose we assume that 
we have found the eigenfunctions of the homogeneous equation by the 
method of sec. 14.3b. Let them be $\Phi_i(x)$. Then we may write 

$\phi(x) - f(x) = \sum_i a_i\,\Phi_i(x)$

$a_i = \int [\phi(x) - f(x)]\,\Phi_i(x)\,dx$    (14-27)

where $\phi(x)$ and f(x) both come from (2). Now substitute (27) in (2) to 
give 

$\sum_i a_i\,\Phi_i(x) = \lambda \int K(x,z)\,f(z)\,dz + \lambda \sum_i a_i \int K(x,z)\,\Phi_i(z)\,dz$    (14-28)

We may also expand f(x): 

$f(x) = \sum_i \beta_i\,\Phi_i(x); \quad \beta_i = \int f(x)\,\Phi_i(x)\,dx$    (14-29)

and obtain by using (26) and (24) 

$\int K(x,z) \sum_i \beta_i\,\Phi_i(z)\,dz = \int \sum_j \frac{\Phi_j(x)\,\Phi_j(z)}{\lambda_j} \sum_i \beta_i\,\Phi_i(z)\,dz = \sum_i \frac{\beta_i}{\lambda_i}\,\Phi_i(x)$

with a similar expression for the last integral of (28). That equation 
becomes 

$\sum_i a_i\,\Phi_i(x) = \lambda \sum_i \frac{\beta_i}{\lambda_i}\,\Phi_i(x) + \lambda \sum_i \frac{a_i}{\lambda_i}\,\Phi_i(x)$    (14-30)

Because of the independence of the functions $\Phi_i$, the coefficients of each 
may be equated on both sides of this equation. Hence, 

$a_i = \lambda \left( \frac{\beta_i}{\lambda_i} + \frac{a_i}{\lambda_i} \right)$

or if $\lambda \ne \lambda_i$, 

$a_i = \frac{\lambda \beta_i}{\lambda_i - \lambda}$    (14-31)

This method, which was devised by Schmidt and Hilbert, thus gives a solution 
for $\lambda \ne \lambda_i$, for we may substitute (29) and (31) into (27) and obtain 

$\phi(x) = f(x) + \lambda \sum_i \frac{\beta_i}{\lambda_i - \lambda}\,\Phi_i(x) = f(x) + \lambda \sum_i \frac{\Phi_i(x)}{\lambda_i - \lambda} \int f(z)\,\Phi_i(z)\,dz$    (14-32)

As we have noted before, the homogeneous equation for $\lambda \ne \lambda_i$ has the 
solution $\phi(x) = 0$ since f(x) = 0. 
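
The series solution (32) may be tried on a case that can also be solved in closed form. The sketch assumes, as our own example, the symmetric kernel equal to x(1 - z) for x < z and z(1 - x) for x > z on the range 0 to 1, with $\Phi_n(x) = \sqrt{2}\,\sin n\pi x$, $\lambda_n = n^2\pi^2$, and f(x) = x. Differentiating the integral equation twice shows that its solution satisfies $\phi'' = -\lambda\phi$ with $\phi(0) = 0$, $\phi(1) = 1$, which supplies the comparison function.

```python
import math

def schmidt_hilbert_solution(x, lam, terms=2000):
    """phi(x) = f(x) + lam * sum of beta_n Phi_n(x) / (lam_n - lam), with f(x) = x."""
    total = x
    for n in range(1, terms + 1):
        lam_n = (n * math.pi) ** 2
        # beta_n = integral over 0..1 of x * sqrt(2) sin(n pi x) dx
        beta_n = math.sqrt(2) * (-1) ** (n + 1) / (n * math.pi)
        phi_n = math.sqrt(2) * math.sin(n * math.pi * x)
        total += lam * beta_n * phi_n / (lam_n - lam)
    return total

def exact_solution(x, lam):
    """Solution of phi'' = -lam phi with phi(0) = 0 and phi(1) = 1 (lam > 0, not an eigenvalue)."""
    root = math.sqrt(lam)
    return math.sin(root * x) / math.sin(root)

difference = abs(schmidt_hilbert_solution(0.3, 1.0) - exact_solution(0.3, 1.0))
```

The denominators $\lambda_n - \lambda$ make plain why the expansion fails when $\lambda$ coincides with an eigenvalue, the case taken up next.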

We must still consider the exceptional case when $\lambda$ is one of the 
eigenvalues of the kernel. Suppose, for example, that $\lambda = \lambda_0$ is an m-fold 
degenerate eigenvalue, i.e., $\lambda_0 = \lambda_1 = \lambda_2 = \cdots = \lambda_m$. Then (2) reads 

$\phi(x) = f(x) + \lambda_0 \int K(x,z)\,\phi(z)\,dz$

and by the preceding method we obtain 

$a_i = \frac{\lambda_0 \beta_i}{\lambda_i - \lambda_0}$

where i is not one of the numbers 1, 2, $\cdots$, m. When i equals one of these 
integers we have, if $a_i$ is to remain finite, 

$\beta_1 = \beta_2 = \cdots = \beta_m = 0$

which in turn requires that 

$\beta_j = \int f(x)\,\Phi_j(x)\,dx = 0; \quad j = 1, 2, \cdots, m$    (14-33)

Thus if $\lambda_0$ is an m-fold degenerate eigenvalue, the inhomogeneous equation 
has solutions only if f(x) is orthogonal to the corresponding eigenfunctions. 
The general solution of the equation is then 

$\phi(x) = f(x) + \lambda_0 {\sum_i}' \frac{\Phi_i(x)}{\lambda_i - \lambda_0} \int f(z)\,\Phi_i(z)\,dz + c_1\Phi_1(x) + \cdots + c_m\Phi_m(x)$    (14-34)

where the prime on the summation sign means that the terms i = 1, 2, $\cdots$, m 
are to be omitted from the sum. 

Problem. Find the solution of the equation of Problem a, sec. 14.3, by the 
Hilbert-Schmidt method for $\lambda$ not equal to an eigenvalue. Show that there are 
no solutions when $\lambda$ is an eigenvalue. 

14.5. Summary of Methods of Solution. — a. The Homogeneous Equation. 

1. $D(\lambda) \ne 0$. No solution except $\phi(x) = 0$. 

2. $D(\lambda) = 0$; $D(x,z;\lambda) \ne 0$. Solution is given by (20) and (21). 
The resulting eigenfunctions are orthogonal and may be normalized. To 
each solution belongs an eigenvalue. 

b. The Non-homogeneous Equation. 

$\lambda \ne \lambda_i$ 

1. Solution given by (9) provided (5) holds. 

2. For all values of $\lambda \ne \lambda_i$ solution is given by (9) and (13). 

3. If K(x,z) = K(z,x), solution is (32). 

$\lambda = \lambda_i$ 

4. K(x,z) = K(z,x); solution is (34). Special methods have been 
given for Volterra's equations of the first and second kinds. 


USE OF INTEGRAL EQUATIONS 


14.6. Relation between Differential and Integral Equations. — We have 
shown in the previous sections how integral equations of the more common 
types may be solved. We now propose to study the relation between 
differential and integral equations so that we may state physical problems 
in either form at will. For this purpose consider as a simple example the 
second order differential equation 

$y'' = f(x,y)$    (14-35)

Integration results in 

$y'(x) = \int_0^x f[z,y(z)]\,dz + C_1$

$y(x) = \int_0^x dt \int_0^t f[z,y(z)]\,dz + C_1 x + C_2$    (14-36)

An alternative form of the last expression^7 is 

$y(x) = \int_0^x (x-z)\,f[z,y(z)]\,dz + g(x)$    (14-37)

$g(x) = C_1 x + C_2$

which is recognized as a non-linear Volterra equation of the second kind 
with y(x) as the unknown. 

^7 To show that the two equations for y(x) are identical, differentiate the last one 
with respect to x; the result is (36). 

The boundary conditions which are needed to determine the two 
integration constants, $C_1$ and $C_2$, may be either of two types: (a) y and 
y' are fixed at one point within the range of integration, say at x = 0; (b) y is 
fixed at two points. The first case is simple, for if y(0) = a, y'(0) = b, 
(37) becomes 

$y(x) = \int_0^x (x-z)\,f[z,y(z)]\,dz + bx + a$

The second case leads to greater difficulties. Suppose y(0) = a, y(1) = b; 
then $C_2 = a$, as before. For x = 1, we have 

$b = y(1) = \int_0^1 (1-z)\,f\,dz + C_1 + a$

or 

$C_1 = (b-a) - \int_0^1 (1-z)\,f\,dz$

where we abbreviate f[z,y(z)] by the single symbol f. Substituting the 
values of $C_1$ and $C_2$ into (37) we obtain 

$y(x) = h(x) + \int_0^x (x-z)\,f\,dz + x \int_0^1 (z-1)\,f\,dz$

$\quad = h(x) + \int_0^x (x-z)\,f\,dz + x \int_0^x (z-1)\,f\,dz + x \int_x^1 (z-1)\,f\,dz$

$\quad = h(x) + \int_0^x z(x-1)\,f\,dz + \int_x^1 x(z-1)\,f\,dz$    (14-38)


where h(x) = a + (b - a)x. We thus see that in this case, if we are 
willing to divide the range of x into two parts with a different kernel for 
each part, 

$K(x,z) = z(x-1), \quad x > z$

$K(x,z) = x(z-1), \quad x < z$

eq. (38) becomes an integral equation of the Fredholm type 

$y(x) = h(x) + \int_0^1 K(x,z)\,f[z,y(z)]\,dz$
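
The two-part kernel can be checked on the simplest possible case, chosen here only for illustration: let f be the constant 1, so that the differential equation is y'' = 1 with y(0) = a, y(1) = b. Then

```latex
\begin{aligned}
\int_0^1 K(x,z)\,dz &= (x-1)\int_0^x z\,dz + x\int_x^1 (z-1)\,dz
   = \frac{x^2(x-1)}{2} - \frac{x(1-x)^2}{2} = -\frac{x(1-x)}{2} \\
y(x) &= a + (b-a)\,x - \frac{x(1-x)}{2}, \qquad
y''(x) = 1, \quad y(0) = a, \quad y(1) = b
\end{aligned}
```

so that the integral formulation reproduces both the differential equation and the two boundary values.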


Problem. Convert the following differential equation and its boundary conditions 
to an integral equation. 

$y'' + y = 0; \quad y(0) = y''(0) = 0; \quad y'(0) = 1$

Ans.: $y(x) = x + \int_0^x (z-x)\,y(z)\,dz$
14.7. Green's Function. — Our problem now is to find a general method 
of constructing such kernels. For this purpose we consider the inhomogeneous 
Sturm-Liouville equation 

$L(u) = (pu')' - qu = -\phi(x)$    (14-39)

the homogeneous form of which has been discussed in Chapter 8. We 
will later prove that a certain function G(x,z) called Green's function is the 
kernel of a homogeneous integral equation which is equivalent to (39) 
and its boundary conditions. At the moment we study the means of 
finding Green's function. For reasons which will presently be clear, it is 
defined to have the following properties: 

a. For fixed z, it is a continuous function of x and satisfies all of the 
boundary conditions to be imposed on u. 

b. Both G' and G'' are continuous at every point within the range 
of x except at x = z, where G' is discontinuous^8 so that 

$G'(z+0) - G'(z-0) = -1/p(z)$    (14-40)

c. Except at x = z, G(x,z) satisfies the differential equation L(G) = 0. 
We now proceed to find such a function G. Suppose two linearly 
independent solutions of 

$L(u) = 0$    (14-41)

are known. If these are $u_1(x)$ and $u_2(x)$ their independence may be 
recognized by the fact that the Wronskian, $u_1 u_2' - u_2 u_1' \ne 0$ (see sec. 
3.13), and the general solution of (41) is 

$u(x) = c_1 u_1 + c_2 u_2$

Let us divide the range of x into two portions; $a \le x < z$, $z < x \le b$, 
and write 

$u_I = (A - \alpha)\,u_1(x) + (B - \beta)\,u_2(x); \quad x < z$

$u_{II} = (A + \alpha)\,u_1(x) + (B + \beta)\,u_2(x); \quad x > z$    (14-42)

where A, $\alpha$, B, $\beta$ are constants to be so chosen that u, which will later be 
taken as our Green function, satisfies conditions a, b and c. If we impose 
on this function the requirements a and b, we must have 

$u_I(z) = u_{II}(z)$

$u_{II}'(z) - u_I'(z) = -1/p(z)$    (14-43)

^8 The notation G'(z + 0) means that G' is evaluated at the discontinuity when it 
is approached from values of x > z while G'(z - 0) is evaluated when the discontinuity 
is approached in the opposite direction. It is necessary to make this distinction in 
order that the magnitude of the discontinuity will be determined with respect to sign. 

or, because of (42) 

$\alpha u_1(z) + \beta u_2(z) = 0$

$\alpha u_1'(z) + \beta u_2'(z) = -\frac{1}{2p(z)}$

Solving these equations for $\alpha$ and $\beta$, we obtain 

$\alpha = -\frac{1}{2p}\,\frac{u_2}{u_1' u_2 - u_1 u_2'}; \quad \beta = \frac{1}{2p}\,\frac{u_1}{u_1' u_2 - u_1 u_2'}$

and hence 

$u(x) = \pm f(x,z) + A u_1(x) + B u_2(x)$    (14-44)

where 

$f(x,z) = \frac{1}{2p(z)} \left[ \frac{u_1(z)\,u_2(x) - u_2(z)\,u_1(x)}{u_2'(z)\,u_1(z) - u_1'(z)\,u_2(z)} \right]$

Here and in the remainder of this chapter, when two equations are given 
or when there is a choice of sign, the first always refers to x < z and the 
second to x > z. The two constants A and B of (44) are determined so 
that u(x) satisfies the boundary conditions of the problem. The resulting 
function, which we henceforth indicate by G(x,z), is Green's function. 

We now prove that if $\phi(x)$ is a continuous function of x, then the function 
which will satisfy the differential equation (39) is given by 

$u(x) = \int_a^b G(x,z)\,\phi(z)\,dz$    (14-45)

Differentiation of (45) with respect to x gives 

$u'(x) = \int_a^b G'(x,z)\,\phi(z)\,dz$

$u''(x) = \int_a^x G''(x,z)\,\phi(z)\,dz + \int_x^b G''(x,z)\,\phi(z)\,dz + G'(x,x-0)\,\phi(x) - G'(x,x+0)\,\phi(x)$

$\quad = \int_a^b G''(x,z)\,\phi(z)\,dz + [G'(x+0,x) - G'(x-0,x)]\,\phi(x)$

$\quad = \int_a^b G''(x,z)\,\phi(z)\,dz - \frac{\phi(x)}{p(x)}$

Therefore 

$pu'' + p'u' - qu = L(u) = \int_a^b (pG'' + p'G' - qG)\,\phi(z)\,dz - \phi(x)$

Requirement c causes the first term on the right to vanish. Hence, we 
have established (39) and completed the proof that G(x,z), calculated as 
described, is the kernel of (45), and that the latter is equivalent to (39) 
and its boundary conditions. 
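
The theorem can be illustrated numerically. The sketch below assumes the Green function for L(u) = u'' with u(0) = u(1) = 0, which follows from (44) with $u_1 = x$, $u_2 = 1$, namely x(1 - z) for x < z and z(1 - x) for x > z. The right-hand side $\phi(z) = \sin \pi z$ and the midpoint quadrature are our choices for the example.

```python
import math

def green(x, z):
    """Green's function of L(u) = u'' with u(0) = u(1) = 0: first form for x < z, second for x > z."""
    return x * (1 - z) if x < z else z * (1 - x)

def solve(phi, x, panels=2000):
    """u(x) = integral over 0..1 of G(x, z) phi(z) dz, by the midpoint rule."""
    h = 1.0 / panels
    return h * sum(green(x, (k + 0.5) * h) * phi((k + 0.5) * h) for k in range(panels))

def phi(z):
    return math.sin(math.pi * z)

# u'' = -sin(pi x) with u(0) = u(1) = 0 has the solution sin(pi x) / pi**2.
error = abs(solve(phi, 0.3) - math.sin(0.3 * math.pi) / math.pi ** 2)
```

The integral reproduces the known solution of u'' = -sin πx, and the boundary values u(0) = u(1) = 0 are satisfied automatically because G vanishes there.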

An important consequence of the properties of Green's function is 
that it is symmetric. The proof proceeds as follows: Let us integrate 
the identity 

$vL(u) - uL(v) = \frac{d}{dx}[p(vu' - uv')]$

This results in a relation known as Green's formula: 

$I_a^b = \int_a^b [vL(u) - uL(v)]\,dx = [p(vu' - uv')]_a^b = S_a^b$    (14-46)

Now let $G(x,z_1) = v$; $G(x,z_2) = u$; and consider the three ranges $a \le x < z_1$; 
$z_1 < x < z_2$; $z_2 < x \le b$. Evaluate the integral, dividing it into 
three parts $a, z_1 - \delta$; $z_1 + \delta, z_2 - \delta$; $z_2 + \delta, b$, where $\delta$ is a small increment 
which will approach zero in the limit. 

We thus may write 

$I_a^{z_1-\delta} + I_{z_1+\delta}^{z_2-\delta} + I_{z_2+\delta}^{b} = S_a^{z_1-\delta} + S_{z_1+\delta}^{z_2-\delta} + S_{z_2+\delta}^{b}$

$\quad = S_a^b - S_{z_1-\delta}^{z_1+\delta} - S_{z_2-\delta}^{z_2+\delta}$    (14-47)

According to (46) each I equals the corresponding S and both must be zero, because 
from c, L(u) = L(v) = 0. This in turn requires that $I_a^b = 0$ and $S_a^b = 0$ since 
otherwise Green's function will not satisfy the boundary conditions. If in (47) 
we let $\delta \to 0$ and use (46) we obtain 

$0 = -p(z_1)\{[v(z_1)u'(z_1) - v'(z_1+0)u(z_1)] - [v(z_1)u'(z_1) - v'(z_1-0)u(z_1)]\}$

$\quad - p(z_2)\{[v(z_2)u'(z_2+0) - v'(z_2)u(z_2)] - [v(z_2)u'(z_2-0) - v'(z_2)u(z_2)]\}$

In writing these equations it must be remembered that u and v are continuous 
for the whole range while u' is discontinuous only at $z_2$ and v' only at $z_1$, 
so that for example $u'(z_1 + 0) = u'(z_1)$. Finally from (40) we obtain 

$u(z_1) = v(z_2)$

or 

$G(z_1,z_2) = G(z_2,z_1)$

Since the points $z_1$ and $z_2$ are arbitrary we write in general 

$G(x,z) = G(z,x)$

The symmetry of Green's function is of considerable importance, since it 
permits application of the Hilbert-Schmidt theory. 

It frequently happens that the two constants A and B of (44) cannot 
be adjusted to satisfy the given boundary conditions. In this case, a 
modified Green function^9 can be found in the following way. Suppose $u_0(x)$ 
is a solution of (41) that satisfies both boundary conditions. Then $cu_0(x)$ 
will also satisfy the conditions. No loss of generality occurs if we determine 
the constant so that $u_0(x)$ is normalized, 

$\int u_0^2(x)\,dx = 1$

and we shall suppose that this is done. We now set 

$L(u) = u_0(x)\,u_0(z)$

and determine a function G(x,z) that has the same properties as we required 
of the simple Green function, except that it satisfies the equation 
$L(G) = u_0(x)\,u_0(z)$ instead of L(G) = 0. We finally require that 

$\int G(x,z)\,u_0(x)\,dx = 0$    (14-48)

The resulting modified Green function, which is symmetric, satisfies the 
inhomogeneous differential equation (39) including its boundary conditions. 
The proof of these facts is similar to that used in the case of the simple 
Green function. 

Problem. Find Green's function for L(u) = u'' with u(0) = u(1) = 0. Hint: 
let $u_1(x) = x$; $u_2(x) = 1$. 

Ans.: See Table 1, sec. 14.9. 

Example. Suppose L(u) = u'' = 0; u(1) = u(-1); u'(1) = u'(-1). 
If we substitute the two linearly independent solutions of the preceding 
problem in (44) we see that $\partial G(x,z)/\partial x = \pm\frac{1}{2} + A$, hence the second 
boundary condition cannot be satisfied. A solution of the differential 
equation which does satisfy the boundary conditions is $u_0$ = constant or 
when normalized $u_0(x) = 1/\sqrt{2}$. Hence we seek a solution of the 
equation $L(u) = u'' = u_0(x)\,u_0(z) = \frac{1}{2}$. This is $u = x^2/4$. Using (44) and 
the results of the last problem we see that 

$G(x,z) = \pm\frac{(x-z)}{2} + \frac{x^2}{4} + Ax + B$

which gives A = -z/2 when the further condition G(1,z) = G(-1,z) is 
imposed. Omitting the constant factor $u_0(x) = 1/\sqrt{2}$ we now determine 
B so that (48) is satisfied. This requires that 

$\int_{-1}^{z} \frac{(x-z)}{2}\,dx - \int_{z}^{1} \frac{(x-z)}{2}\,dx + \int_{-1}^{1} \left( \frac{x^2}{4} - \frac{zx}{2} + B \right) dx = 0$

^9 A different procedure is possible in some cases; see Lovitt, loc. cit. 

The result is $B = \frac{1}{6} + \frac{z^2}{4}$, so that finally 

$G(x,z) = \pm\frac{(x-z)}{2} + \frac{(x-z)^2}{4} + \frac{1}{6}$

This will satisfy all of the boundary conditions. 
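
The stated properties of this modified Green function are easily verified with exact arithmetic. In the sketch below (our own check, with hypothetical function names) Simpson's rule is applied separately on each side of x = z; since G is a quadratic in x on each piece, the rule is exact there.

```python
from fractions import Fraction

def modified_green(x, z):
    """G(x, z) = +-(x - z)/2 + (x - z)**2/4 + 1/6, upper sign for x < z, lower for x > z."""
    sign = 1 if x < z else -1
    return sign * (x - z) / 2 + (x - z) ** 2 / 4 + Fraction(1, 6)

def simpson(func, a, b):
    """Simpson's rule on one interval; exact for polynomials up to the third degree."""
    return (b - a) / 6 * (func(a) + 4 * func((a + b) / 2) + func(b))

def integral_over_x(z):
    """Integral of G(x, z) over x from -1 to 1, split at the point x = z."""
    g = lambda x: modified_green(x, z)
    return simpson(g, Fraction(-1), z) + simpson(g, z, Fraction(1))

z = Fraction(1, 3)
orthogonality = integral_over_x(z)                                          # condition (48)
periodic = modified_green(Fraction(1), z) - modified_green(Fraction(-1), z) # G(1,z) - G(-1,z)
symmetric = modified_green(Fraction(1, 5), Fraction(-2, 3)) - modified_green(Fraction(-2, 3), Fraction(1, 5))
```

All three quantities vanish exactly, confirming condition (48), the periodic boundary condition and the symmetry of G.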

14.8. The Inhomogeneous Sturm-Liouville Equation. — Having proved
that we can convert (39) to an integral equation, we wish to give explicit
forms of the latter for different $\phi(x)$. Suppose

$$\phi(x) = \lambda w u - \chi(x)$$

so that (39) becomes

$$L(u) + \lambda w u = \chi(x)$$ (14-49)

The resulting integral equation is

$$u(x) = \lambda \int G(x,z)\,w(z)\,u(z)\,dz + g(x); \qquad g(x) = -\int G(x,z)\,\chi(z)\,dz$$ (14-49a)

which is equivalent to (49) and its boundary conditions. Finally if
$\chi(x) = 0$, the homogeneous differential equation

$$L(u) + \lambda w u = 0$$ (14-49b)

and its boundary conditions become equivalent to

$$u(x) = \lambda \int G(x,z)\,w(z)\,u(z)\,dz$$ (14-50)

but the kernel in this case is not symmetric unless $w(x) = 1$. If that is
true, (50) is a homogeneous integral equation and can be solved by the
methods of sec. 14.3b. If $w(x) \neq 1$, we may introduce a new unknown
function

$$y(x) = u(x)\sqrt{w(x)}$$

multiply the integral equation by $\sqrt{w(x)}$ and obtain

$$y(x) = \lambda \int H(x,z)\,y(z)\,dz$$

where we now have a symmetric kernel $H(x,z) = G(x,z)\sqrt{w(x)w(z)}$.
Eq. (49b) forms the basis of the Sturm-Liouville theory which was
discussed in sec. 8.5.
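
As an illustration of the homogeneous integral equation with $w = 1$, the following sketch discretizes $u(x) = \lambda \int_0^1 G(x,z)u(z)\,dz$ for $L(u) = u''$ with $u(0) = u(1) = 0$, where $G(x,z) = (1 - z)x$ for $x < z$. The midpoint grid, the power iteration and all function names are choices made here for the example, not the method of sec. 14.3b. The smallest eigenvalue of the differential problem is $\pi^2$, so the estimate should come out close to that value.

```python
import math


def green(x, z):
    """G(x, z) = (1 - z) x for x < z, with x and z interchanged for x > z."""
    lo, hi = (x, z) if x <= z else (z, x)
    return (1.0 - hi) * lo


def smallest_eigenvalue(n=100, sweeps=40):
    """Estimate the smallest lambda in u = lambda * integral(G u) by power iteration."""
    h = 1.0 / n
    xs = [(i + 0.5) * h for i in range(n)]           # midpoint grid on (0, 1)
    K = [[green(x, z) * h for z in xs] for x in xs]  # discretized integral operator
    u = [1.0] * n                                    # start vector, max-norm 1
    mu = 0.0
    for _ in range(sweeps):
        v = [sum(row[j] * u[j] for j in range(n)) for row in K]
        mu = max(abs(c) for c in v)                  # dominant eigenvalue of K, i.e. 1/lambda
        u = [c / mu for c in v]
    return 1.0 / mu, xs, u


lam, xs, u = smallest_eigenvalue()
print(lam, math.pi ** 2)
```

The iteration converges to the largest eigenvalue $1/\lambda_1$ of the kernel, which is why its reciprocal is returned; the accompanying vector approximates $\sin \pi x$ up to normalization.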

Let us consider (41) and (49a) further. We write

$$L(v) + \lambda v = 0; \qquad L(u) = 0$$ (14-51)

and suppose that their Green functions are known so that

$$u = G(x,\eta); \qquad v = \Gamma(x,\xi)$$ (14-52)

Substitute these relations in Green's formula (46), use (40) and arguments
similar to those which proved that Green's function is symmetric. The
result is

$$\Gamma(\eta,\xi) = G(\eta,\xi) + \lambda \int G(x,\eta)\,\Gamma(x,\xi)\,dx$$

For fixed $\xi$ this is recognized as identical with (2) where $\Gamma$ is the unknown,
$G(\eta,\xi) = f(\eta)$ and $G(x,\eta)$ is the kernel. If we now change $x$, $\eta$, $\xi$ to $z_1$, $x$, $z$
and remember that the kernel is symmetric we obtain

$$\Gamma(x,z;\lambda) - G(x,z) = \lambda \int G(x,z_1)\,\Gamma(z_1,z;\lambda)\,dz_1$$

which shows by comparison with (10) that $\Gamma(x,z;\lambda)$ is the resolvent of the
kernel $G(x,z_1)$. We may thus use equations of the form of (2) or (10) to
find the solution of either form of (51) when the appropriate Green function
(52) is known. Finally, referring to (17) and the result of Problem b,
sec. 14.3, we see that

$$\frac{D'(\lambda)}{D(\lambda)} = \frac{d \ln D(\lambda)}{d\lambda} = -\int \Gamma(x,x;\lambda)\,dx$$

which will give $D(\lambda)$ by integration over $\lambda$ and hence the eigenvalues from
the relation $D(\lambda) = 0$.


Problem. Find Green's function for $L(u) = u'' + k^2u$ with the boundary conditions
of the previous problem. Hint: take $u_1(x) = \cos kx$; $u_2(x) = \sin kx$.

Ans.: See Table 1.


14.9. Some Examples of Green's Function. — For convenience of
reference, we list in Table 1 Green's function for some important differential
equations. The following boundary conditions include those most
often encountered:

a. $u(0) = u(1) = 0$
b. $u(-1) = u(1)$; $u'(-1) = u'(1)$
c. $u'(0) = u'(1) = 0$
d. $u(-1) = u(1) = 0$
e. $u(0) = -u(1)$; $u'(0) = -u'(1)$
f. $u(0) = u(1) = u'(0) = u'(1) = 0$
g. $u(x)$ finite; $-\infty < x < \infty$

When the limits are $a$ and $b$, the appropriate Green function $G(X,Z)$ may be
found from our results by the transformations

$$x = \frac{X - a}{b - a}; \qquad z = \frac{Z - a}{b - a}$$ (14-53)

for if $G(x,z)$ is bounded by $(0,1)$ then $G(X,Z)$ is bounded by $(a,b)$. The
method of calculating Green's function in each case is identical with that
described in the preceding sections. When only one equation is given for
$G(x,z)$ it refers to $x < z$; for $x > z$, interchange $x$ and $z$.

In addition to the results found in Table 1, Green's function for several
other differential equations will be given (see also Table 1 in sec. 8.5).

For the Legendre differential equation

$$L(u) = [(1 - x^2)u']'; \qquad -1 < x < 1$$

the boundary conditions are that the solutions remain finite at $x = \pm 1$.
Green's function is

$$G(x,z) = -\tfrac{1}{2}\ln[(1 - x)(1 + z)] + \ln 2 - \tfrac{1}{2}$$ (14-54)

The associated Legendre differential equation is

$$[(1 - x^2)u']' - \frac{m^2 u}{1 - x^2} = 0$$

and

$$G(x,z) = \frac{1}{2m}\left[\frac{(1 + x)(1 - z)}{(1 - x)(1 + z)}\right]^{m/2}; \qquad m \neq 0$$ (14-55)

For $m = 0$, the proper Green function is (54).

The zero-th order Bessel equation is $L(u) = (xu')' = 0$. With the
boundary conditions $u(1) = 0$; $u(0)$ finite

$$G(x,z) = -\ln z$$ (14-56)

The $n$-th order equation is

$$(xu')' - \frac{n^2}{x}\,u = 0$$

and

$$G(x,z) = \frac{1}{2n}\left[\left(\frac{x}{z}\right)^n - (xz)^n\right]$$

with the same boundary conditions as for the zero-th order equation.
TABLE 1

      L(u)           Boundary     G(x,z)
                     condition
  1.  u''            (a)          (1 - z)x
  2.  u''            (b)          (1/4)(x - z)^2 - (1/2)|x - z| + 1/6
  3.  u''            (c)          x
  4.  u''            (d)          -(1/2){|x - z| + xz - 1}
  5.  u''            (e)          -(1/2)|x - z| + 1/4
  6.  u''            (g)          none exists
  7.  u'' + λu       (a)          sin kx sin k(1 - z) / (k sin k);  k^2 = λ > 0
  8.  u'' + λu       (b)          -cos k(x - z + 1) / (2k sin k)
  9.  u'' - λu       (a)          sinh kx sinh k(1 - z) / (k sinh k)
 10.  u'' - λu       (b)          cosh k(x - z + 1) / (2k sinh k)
 11.  u'' - λu       (c)          cosh kx cosh k(1 - z) / (k sinh k)
 12.  u'' - u        (g)          (1/2) e^{-|x - z|}
 13.  u''''          (f)          x^2 (z - 1)^2 (2xz + x - 3z) / 6


APPLICATION TO PHYSICAL PROBLEMS 

14.10. Abel's Integral Equation. — One of the earliest applications of
integral equations to a physical problem was made by Abel (1823). Consider
a particle which falls along a smooth curve in a vertical plane. Let
its original position above a given horizontal plane be $z_0$, its position at
time $t$ be $z$ and at the end of its fall be $z = 0$. Let $ds$ be the distance
travelled in time $dt$. Then if the particle moves under no force but $mg$, the
force of gravity, its velocity is

$$v = \frac{ds}{dt} = \sqrt{2g(z_0 - z)}$$ (14-57)

The whole time of descent is

$$T(z_0) = \int dt = \int \frac{ds}{\sqrt{2g(z_0 - z)}} = \frac{1}{\sqrt{2g}} \int_0^{z_0} \frac{s'(z)\,dz}{\sqrt{z_0 - z}}$$

If the shape of the curve is given in terms of $z$,

$$s = s(z)$$








then the time of descent may be calculated. The reverse problem studied
by Abel is to find a curve for which the time $T$ is a given function of $z_0$,
$T(z_0) = f(z_0)$ (compare the brachistochrone problem, sec. 6.1b). We
thus wish to find $s(z)$ from

$$f(z_0) = \frac{1}{\sqrt{2g}} \int_0^{z_0} \frac{s'(z)\,dz}{\sqrt{z_0 - z}}; \qquad z_0 > 0$$

or, writing $\phi(z) = s'(z)/\sqrt{2g}$,

$$f(z_0) = \int_0^{z_0} \frac{\phi(z)\,dz}{\sqrt{z_0 - z}}$$ (14-58)

which is a Volterra integral equation of the first kind. The presence of the
singularity at $z = z_0$ makes it necessary to solve the equation in the manner
of sec. 14.2c. The details may be left to the reader.
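
A candidate solution of Abel's equation can be tested by direct quadrature. The sketch below assumes the equation in the form $f(z_0) = \int_0^{z_0} \phi(z)\,dz/\sqrt{z_0 - z}$ and removes the endpoint singularity with the substitution $z = z_0 \sin^2\theta$; the function names are hypothetical. For a constant time of descent $f(z_0) = T_0$ (the tautochrone), the standard Abel inversion formula, which is quoted here rather than derived in this section, gives $\phi(z) = T_0/(\pi\sqrt{z})$.

```python
import math


def descent_time(phi, z0, n=2000):
    """Evaluate the integral of phi(z) / sqrt(z0 - z) over 0 <= z <= z0.

    With z = z0 * sin(theta)**2 one has dz / sqrt(z0 - z) = 2 sqrt(z0) sin(theta) dtheta,
    so the singular factor disappears; the midpoint rule avoids theta = 0.
    """
    h = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        z = z0 * math.sin(theta) ** 2
        total += phi(z) * 2.0 * math.sqrt(z0) * math.sin(theta)
    return total * h


T0 = 1.5  # assumed constant time of descent


def phi_tautochrone(z):
    """phi(z) = T0 / (pi sqrt(z)), the Abel inversion of f(z0) = T0."""
    return T0 / (math.pi * math.sqrt(z))


times = [descent_time(phi_tautochrone, z0) for z0 in (0.1, 0.5, 2.0)]
print(times)
```

Each entry of `times` should reproduce $T_0$ regardless of the starting height, which is the defining property of the tautochrone.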

14.11. Vibration Problems. — a. The homogeneous string treated in
Chapter 7 was reduced to the eigenvalue problem (cf. eq. 7-33),

$$S''(x) + k^2 S(x) = 0$$

If we make the proper change of variable so that the boundary conditions
are $S(0) = S(1) = 0$ we see that the differential equation is similar to
(49b), the boundary conditions lead to Green's function (1) from Table 1
and the resulting homogeneous integral equation is of the form of eq. (50)
with $\lambda = k^2$ and $w = 1$.


b. Forced Vibrations. Suppose the string is subjected to a periodic
force $f(x)\cos(\beta t + \delta)$. Then, with the constant in eq. (1) of Chapter 8
set equal to 1, we have

$$\frac{\partial^2 U}{\partial t^2} = \frac{\partial^2 U}{\partial x^2} + f(x)\cos(\beta t + \delta)$$ (14-59)

with boundary conditions $U(0,t) = U(1,t) = 0$. We seek a solution of
the form

$$U = S(x)\cos(\beta t + \delta)$$

which reduces (59) to

$$S''(x) + \beta^2 S(x) = -f(x)$$ (14-60)

with $S(0) = S(1) = 0$. This differential equation is like
(49) and the integral equation like (49a) with kernel identical with that
of the homogeneous string. When $\beta^2$ is an eigenvalue, the integral equation
can be solved only if $f(x)$ is orthogonal to the corresponding eigenfunctions of
the homogeneous equation. We know from Chapter 8 that the latter are $\sin n\pi x$,
hence the required condition is

$$\int_0^1 f(x)\sin n\pi x\,dx = 0$$
If $\beta^2$ is not an eigenvalue, solutions are still possible. Following the
procedure of sec. 14.8, we look for Green's function of eq. (60) which is given
as item (7) in Table 1. This is the resolvent of our integral equation,
hence from eq. (10) the unique solution of (60) is

$$S(x) = g(x) + \beta^2 \int_0^1 \Gamma(x,z)\,g(z)\,dz; \qquad g(x) = \int_0^1 G(x,z)\,f(z)\,dz$$
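
The resolvent relation for the forced string can be tried on a simple load. The sketch below takes $f(x) = 1$, $G$ as item (1) and $\Gamma$ as item (7) of Table 1 with $k = \beta$, and sets $g(x) = \int_0^1 G(x,z)f(z)\,dz$, which is what (49a) gives when $\chi = -f$. It compares $g + \beta^2\int\Gamma g\,dz$ and the direct integral $\int\Gamma f\,dz$ with the elementary solution $S = (\cos\beta x + \tan(\beta/2)\sin\beta x - 1)/\beta^2$ of $S'' + \beta^2 S = -1$, $S(0) = S(1) = 0$. The quadrature rule and names are assumptions of this example.

```python
import math


def G(x, z):
    """Item (1) of Table 1: (1 - z) x for x < z, symmetric in x and z."""
    lo, hi = min(x, z), max(x, z)
    return (1.0 - hi) * lo


def Gamma(x, z, beta):
    """Item (7) of Table 1 with k = beta: sin kx sin k(1 - z) / (k sin k) for x < z."""
    lo, hi = min(x, z), max(x, z)
    return math.sin(beta * lo) * math.sin(beta * (1.0 - hi)) / (beta * math.sin(beta))


def integrate(func, n=400):
    """Midpoint rule on (0, 1)."""
    h = 1.0 / n
    return h * sum(func((i + 0.5) * h) for i in range(n))


def forced_string(f, beta, x):
    """Return S(x) computed via the resolvent formula and via the direct integral."""
    def g(t):
        return integrate(lambda z: G(t, z) * f(z))
    via_resolvent = g(x) + beta ** 2 * integrate(lambda z: Gamma(x, z, beta) * g(z))
    direct = integrate(lambda z: Gamma(x, z, beta) * f(z))
    return via_resolvent, direct


beta, x = 2.0, 0.37   # beta**2 = 4 is not an eigenvalue (those are n**2 * pi**2)
via_resolvent, direct = forced_string(lambda z: 1.0, beta, x)
exact = (math.cos(beta * x) + math.tan(beta / 2) * math.sin(beta * x) - 1.0) / beta ** 2
print(via_resolvent, direct, exact)
```

The three numbers should agree to within the quadrature error, which illustrates that item (7) acts as the resolvent of the kernel in item (1).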


c. The Suspended Rope. Let a rope of unit length hang in its equilibrium
position from the point $x = 1$. If it executes small vibrations in
a vertical plane, its equation of motion is

$$\frac{\partial^2 U}{\partial t^2} = \frac{\partial}{\partial x}\left(x\,\frac{\partial U}{\partial x}\right)$$

with $U$ as its displacement. The horizontal component of its tension at $x$
is $x(\partial U/\partial x)$, so the boundary conditions are $U(1) = 0$, $U(0)$ finite.
Writing $U = u(x)\phi(t)$ we obtain

$$[xu'(x)]' + k^2 u(x) = 0$$

$$\phi''(t) + k^2\phi(t) = 0$$

The proper Green function for the homogeneous differential equation in $x$
is eq. (56).



CHAPTER 15 

GROUP THEORY 
PROPERTIES OF A GROUP 

Group theory has become so vital a part of modern physical and
chemical analysis that the inclusion of its basic structure seemed inevitable
to the authors of this book. Because of the great volume of available
material arbitrary selection had to be made, and many proofs had to be
omitted or given only in outline. Care has been taken, however, to insure
that the attentive reader of the present chapter will be able to familiarize
himself with all the tools needed for handling the simpler problems of
group theory, such as those arising in quantum mechanics and in the field of
molecular structure. A certain amount of material, easily obtained by the
methods discussed in this chapter, but of somewhat lengthy derivation, has
been collected at the end in Table 7.

15.1. Definitions. — A group¹ is a set of abstract elements $A, B, C, \cdots$,
finite or infinite in number, with a law of combination for any two elements
$A$ and $B$ to form a product $AB$ such that:

a. Every product of two elements and the square of every element
is a member of the set.

b. The set contains a unit element $E$ for which $EA = AE = A$ for
every member of the set.

c. The associative law holds: $A(BC) = (AB)C$.

d. Every element has an inverse, $X = A^{-1}$, so that $AX = AA^{-1} =
A^{-1}A = E$.

The set of all integers, positive, negative and zero, forms a group if the
law of combination is addition. The unit element is zero and the negative
of every element is its inverse. These numbers do not form a group if the
law of combination is multiplication. In this case, $E = 1$, but the element
zero has no inverse, hence (d) cannot be satisfied. For any law of combination,
we always speak of a product and write the two elements as if they
were multiplied together.

¹ For general treatises on group theory, see: Speiser, A., “Theorie der Gruppen von
endlicher Ordnung,” Second Edition, J. Springer, Berlin, 1927; Burnside, W., “The
Theory of Groups,” Cambridge University Press, 1927.
A finite group of order $g$ contains a finite number of elements, $g$. A
simple example of such a group (of order four) is furnished by the numbers
$\pm 1$, $\pm i$. If $n$ is the smallest integer for which $X^n = E$, $n$ is called the
order of the element $X$. The $n$ elements $X, X^2, \cdots, X^{n-1}, X^n = E$
form the period of $X$, indicated by $\{X\}$. The period of a single element is
thus a finite group; it is called a cyclic group.

All of the groups so far mentioned have the property that $AB = BA$
for every element. When this condition is fulfilled, the group is said to be
Abelian. Two or more cyclic groups (they are also Abelian) may be combined
to form a single group which is non-Abelian. Suppose

$$A^3 = E; \qquad C^2 = E; \qquad CA = A^2C$$ (15-1)

then the group, which we designate by $D_3$ (for reasons which appear later)
is of order six with elements $E$, $A$, $A^2$, $C$, $AC$, $A^2C$. The products of these
elements may be arranged in a multiplication table; $CA$, for example, is
found at the intersection of row $C$ and column $A$. If we let $A^2 = B$,
$AC = D$, $A^2C = F$ and use (1) we obtain for the group $D_3$



          E   A   B   C   D   F

      E   E   A   B   C   D   F
      A   A   B   E   D   F   C
      B   B   E   A   F   C   D
      C   C   F   D   E   B   A
      D   D   C   F   A   E   B
      F   F   D   C   B   A   E          (15-2)


It should be noticed that each element occurs once and only once in each 
row or column. 

Problem a. Use (15-1) to derive the multiplication table of (15-2). 

Problem b. Show that if any element occurs more than once in a row or column 
of a multiplication table for a group then the group postulates (a) -(d) could not be 
fulfilled. 
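
The table can be generated mechanically from the defining relations. The sketch below assumes the relations $A^3 = E$, $C^2 = E$, $CA = A^2C$ and writes every element as $A^iC^j$; the encoding as pairs `(i, j)` and the helper names are choices made for this example only.

```python
# Each element is A^i C^j with i in {0, 1, 2} and j in {0, 1}.
NAMES = {(0, 0): "E", (1, 0): "A", (2, 0): "B",
         (0, 1): "C", (1, 1): "D", (2, 1): "F"}
ELEMS = {name: key for key, name in NAMES.items()}
ORDER = "EABCDF"


def mul(p, q):
    """Product pq of two named elements, reduced with A^3 = C^2 = E and CA = A^2 C."""
    (i, j), (k, l) = ELEMS[p], ELEMS[q]
    sign = -1 if j else 1          # moving C to the right past A^k turns A^k into A^(-k)
    return NAMES[((i + sign * k) % 3, (j + l) % 2)]


# Row p, column q holds the product pq, as in the text.
TABLE = {p: [mul(p, q) for q in ORDER] for p in ORDER}

print("  " + " ".join(ORDER))
for p in ORDER:
    print(p, " ".join(TABLE[p]))
```

Every row and every column of the result is a rearrangement of the six elements, in line with the remark that each element occurs once and only once.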

15.2. Subgroups. — A group whose elements are contained in another
group is called a subgroup. Thus we may always find subgroups in any
group by forming the period of each of its elements. For example, in $D_3$
a subgroup of order three is obtained from $\{A\} = \{B\} = E, A, B$. Similarly,
three different subgroups, each of order two, may be found: $\{C\} = E, C$;
$\{D\} = E, D$; $\{F\} = E, F$. In addition to these subgroups, the single
element $E$ is a subgroup of order one while the group itself is a subgroup of
order six. In this case, each subgroup, except the group itself, is cyclic. It
does not follow, however, that all subgroups are cyclic.

Suppose a given group is of order $g$ and a subgroup of it is of order $h$
with elements $A_1, A_2, \cdots, A_h$. Now take $B$, an element of the group which
is not contained in the subgroup, and form the products $BA_1, BA_2, \cdots,
BA_h$. These must all be in the group but none can be in the subgroup, for
if $BA_i = A_j$ were one of the members of the subgroup then $B = A_jA_i^{-1}$
would also be in the subgroup, which is contrary to our assumption concerning
the selection of $B$. We have now found $2h$ members of the group. If
$2h < g$, it will be possible to find a new element $C$ contained neither among
the elements $A_1, \cdots, A_h$ nor among the elements $BA_1, \cdots, BA_h$. Repeating
the operations of multiplication and using the same arguments as before
we obtain $h$ new elements $CA_1, \cdots, CA_h$. Since the group is of finite order,
the procedure must end when we have found $kh = g$ elements ($k$ an integer).
It thus follows that the order of the subgroup must be a divisor of
the order of the whole group. In the example of the preceding paragraph,
we see that we have found all possible subgroups since the only divisors of 6
(the order of the group) are 1, 2, 3, and 6.

15.3. Classes. — Let $A$, $B$ and $X$ be any three elements of a group;
then if $B = X^{-1}AX$, $B$ is said to be the transform of $A$ by the element $X$;
$A$ and $B$ are conjugate to each other. The following properties of conjugate
elements may be proved; it is easy to verify them for $D_3$ by the use
of the group table (2).

a. Every element is conjugate with itself.

b. If $A$ is conjugate with $B$, then $B$ is conjugate with $A$.

c. If $A$ is conjugate with both $B$ and $C$, then $B$ and $C$ are conjugate
with each other.

The complete set of elements $\mathcal{C} = A_1, A_2, \cdots, A_r$, which are conjugate
with each other, is called a class of the group. If the group contains the
elements $A_1 (= E), A_2, \cdots, A_g$ the class of $A$ may be found by calculating

$$E^{-1}AE = A,\; A_2^{-1}AA_2,\; \cdots,\; A_g^{-1}AA_g$$

although not all of these elements will be distinct, as may be seen from the
following example. Clearly $\mathcal{C}_1 = E$ always forms a class by itself. In
(2), $\mathcal{C}_2 = A, B$, for

$$E^{-1}AE = A; \qquad B^{-1}AB = A; \qquad D^{-1}AD = B$$
$$A^{-1}AA = A; \qquad C^{-1}AC = B; \qquad F^{-1}AF = B$$

Similarly $\mathcal{C}_3 = C, D, F$. By arguments similar to those used in discussing
subgroups it follows that the whole group may be separated into a number
of different classes none of which contain any elements in common. Moreover,
if there are $h$ elements of a group which transform a given element into
one particular element of the same class, then the number of elements in that
class is $r = g/h$ where $g$ is the order of the group.
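
The separation of $D_3$ into classes can be reproduced from the multiplication table alone. In the sketch below the table is entered row by row (row $p$, column $q$ holding $pq$) and every element is transformed by every other; the dictionary layout and function names are illustrative assumptions.

```python
ORDER = "EABCDF"
ROWS = ["EABCDF", "ABEDFC", "BEAFCD", "CFDEBA", "DCFAEB", "FDCBAE"]

# MUL[p, q] is the product pq read from row p, column q of the table.
MUL = {(p, q): ROWS[i][j] for i, p in enumerate(ORDER) for j, q in enumerate(ORDER)}
INV = {p: next(q for q in ORDER if MUL[p, q] == "E") for p in ORDER}


def transform(a, x):
    """Return the transform X^-1 A X of the element a by the element x."""
    return MUL[MUL[INV[x], a], x]


def classes():
    """Collect the distinct sets of mutually conjugate elements."""
    seen, result = set(), []
    for a in ORDER:
        if a not in seen:
            cls = sorted({transform(a, x) for x in ORDER}, key=ORDER.index)
            result.append(cls)
            seen.update(cls)
    return result


def stabilizer_count(a):
    """Number h of elements x that transform a into itself."""
    return sum(1 for x in ORDER if transform(a, x) == a)


CLASSES = classes()
print(CLASSES)
print({a: len(ORDER) // stabilizer_count(a) for a in "EAC"})
```

The second printed line applies $r = g/h$ with $h$ counted as the number of elements leaving the chosen element unchanged, and should reproduce the class sizes.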
15.4. Complexes. — A set of elements from a group, considered as a
whole, is called a complex. If the complex $\mathcal{A}$ contains $A$, $B$, $C$ then $C\mathcal{A}$
contains $CA$, $CB$, $C^2$. By the product of two complexes $\mathcal{A}\mathcal{B}$ we mean the
product of every element in $\mathcal{A}$ with every element in $\mathcal{B}$, but products
occurring more than once are only taken once. By the complex $\mathcal{G}$ we mean the
whole group. If $\mathcal{H}$ is a subgroup, then

$$\mathcal{H}\mathcal{H} = \mathcal{H}$$ (15-3)

If $X$ is an element of $\mathcal{G}$ not contained in $\mathcal{H}$ then the complex $\mathcal{H}X$ is called
a right coset (Nebengruppe) and $X\mathcal{H}$ is a left coset. Cosets are not groups
since $\mathcal{H}X$ does not contain $E$. It is easy to see that if another element $Y$ is
neither in $\mathcal{H}$ nor in $\mathcal{H}X$, then the coset $\mathcal{H}Y$ will contain no element common
with $\mathcal{H}$ or $\mathcal{H}X$, so that the whole group may be written as a sum of a finite
number of cosets

$$\mathcal{G} = \mathcal{H} + \mathcal{H}X + \mathcal{H}Y + \mathcal{H}Z + \cdots$$

The group may also be divided in this way by means of left cosets. In
$D_3$, we may write

$$\mathcal{G} = \mathcal{H} + \mathcal{H}C = \mathcal{H} + \mathcal{H}D = \mathcal{H} + \mathcal{H}F = \mathcal{H} + C\mathcal{H} = \mathcal{H} + D\mathcal{H} = \mathcal{H} + F\mathcal{H}$$

where $\mathcal{H} = E, A, B$. The index of a subgroup equals the order of the
group divided by the order of the subgroup. It also equals the number of
complexes obtained by splitting a group into that particular subgroup and
its cosets; two, in the example just given.

15.5. Conjugate Subgroups. — If a subgroup $\mathcal{H}$ contains the elements
$H_1 (= E), H_2, \cdots, H_h$ then it also contains $EH_j = H_j, H_2H_j, \cdots, H_hH_j$
for every $H_j$ in $\mathcal{H}$, and it contains $H_j^{-1}E = H_j^{-1}, H_j^{-1}H_2, \cdots, H_j^{-1}H_h$. In
fact these arrangements of the $h$ elements of $\mathcal{H}$ are identical except for the
sequence in which the members are written. Still another arrangement is
$H_j^{-1}EH_j = E, H_j^{-1}H_2H_j, \cdots, H_j^{-1}H_hH_j$. To see this, sort out the arrangement
$EH_j = H_j, \cdots, H_hH_j$ so that the natural order $H_1, H_2, \cdots, H_h$ is
regained and multiply each element by $H_j^{-1}$. A similar argument will
show that for $X$, any member of the group (not necessarily contained in $\mathcal{H}$),
$X^{-1}\mathcal{H}X$ is also a subgroup, but $X^{-1}\mathcal{H}X$ and $\mathcal{H}$, called conjugate subgroups,
may be different if $X$ is not in $\mathcal{H}$. When $\mathcal{H}$ and $X^{-1}\mathcal{H}X$ are identical for
every $X$ in the group, $\mathcal{H}$ is called an invariant subgroup or a normal divisor.
To illustrate these statements choose $\mathcal{H} = E, C$ and $\mathcal{H} = E, A, B$ from
$D_3$. It is easily verified that the only invariant subgroup of $D_3$ is $\mathcal{H} = E,
A, B$. The invariant subgroup and its cosets form a group² called the
quotient (or factor) group with the invariant subgroup as unit element. In
$D_3$, if $\mathcal{J} = \mathcal{H}C$, then the multiplication table of the quotient group $\mathcal{G}/\mathcal{H}$ is

          ℋ   𝒥

      ℋ   ℋ   𝒥
      𝒥   𝒥   ℋ          (15-4)

15.6. Isomorphism. — Two groups $\mathcal{G}$ and $\mathcal{G}'$ are said to be simply
isomorphic if to each element $A, B, C, \cdots$ of $\mathcal{G}$ there corresponds an element
$A', B', C', \cdots$ of $\mathcal{G}'$ so that if $AB = C$, then $A'B' = C'$ for every product.
In the general case, two or more elements of one group may be isomorphous
with a single element of another group. Thus the quotient group (4) is
multiply isomorphous with $D_3$, for $\mathcal{H}$ corresponds to $E$, $A$, $B$ and $\mathcal{J}$ to
$C$, $D$, $F$.

In order to find a group which is simply isomorphous with $D_3$, we consider
the $n!$ permutations of $n$ symbols. By $(acbed)$ we shall mean $a$
replaced by $c$, $c$ replaced by $b$, $b$ by $e$, $e$ by $d$ and $d$ by $a$. This may also be
written as $(bedac)$ or $(dacbe)$ as long as we do not change the cyclic order of
the symbols. When a single letter occurs in a parenthesis, that letter is
unaffected by the permutation, hence we will write $(bce)(a)(d)$ as $(bce)$.
By the product of two permutations, we mean the permutation directed in
the left parenthesis followed³ by the permutation in the second parenthesis.
For example in the product $(acbed)(bce)$, $a$ is replaced by $c$ and then $c$ by $e$,
the net result for $a$ being that it is replaced by $e$. Continuing in this way
we obtain $(acbed)(bce) = (aed)$. If we use only three letters and write

$$E = (a)(b)(c) \qquad B = (acb) \qquad D = (ac)(b)$$
$$A = (abc) \qquad C = (a)(bc) \qquad F = (ab)(c)$$

the resulting operations form a group which is simply isomorphic with $D_3$,
for

$$AB = (abc)(acb) = (a)(b)(c) = E$$
$$BC = (acb)(bc) = (ab) = F; \text{ etc.}$$

Problem. Derive the complete multiplication table for the group of permutations
on three letters.
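
The cycle arithmetic can be automated in a few lines. The sketch below stores a permutation as a mapping from each letter to the letter that replaces it and multiplies from left to right, the convention adopted in the text; the function names and the dictionary representation are assumptions made for this illustration.

```python
def cycles(*cyc, letters="abc"):
    """Build a permutation (letter -> replacing letter) from cycles such as "abc"."""
    perm = {x: x for x in letters}
    for c in cyc:
        for i, x in enumerate(c):
            perm[x] = c[(i + 1) % len(c)]
    return perm


def product(p, q):
    """Apply p first and then q (left-to-right convention)."""
    return {x: q[p[x]] for x in p}


def as_cycles(perm):
    """Write a permutation in cycle notation, omitting unchanged letters."""
    out, seen = "", set()
    for x in sorted(perm):
        if x not in seen and perm[x] != x:
            cyc, y = "", x
            while y not in seen:
                seen.add(y)
                cyc += y
                y = perm[y]
            out += "(" + cyc + ")"
    return out or "E"


# The five-letter example from the text.
example = as_cycles(product(cycles("acbed", letters="abcde"),
                            cycles("bce", letters="abcde")))

# The six permutations on three letters and their multiplication table.
GROUP = {"E": cycles(), "A": cycles("abc"), "B": cycles("acb"),
         "C": cycles("bc"), "D": cycles("ac"), "F": cycles("ab")}
NAME = {tuple(sorted(p.items())): n for n, p in GROUP.items()}
TABLE = {p: "".join(NAME[tuple(sorted(product(GROUP[p], GROUP[q]).items()))]
                    for q in "EABCDF") for p in "EABCDF"}

print(example)
for p in "EABCDF":
    print(p, TABLE[p])
```

With the opposite (right-to-left) convention the arguments of `product` would simply be interchanged, and the table would be the transpose of the one printed.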

² Note that the elements of the quotient group are complexes, i.e., collections of the
original elements of $\mathcal{G}$.

³ Note that we apply the permutation from left to right. The opposite convention
is often used.
15.7. Representation of Groups.⁴ — If to every member of a group
$A_1, A_2, A_3, \cdots$ we can associate a square matrix $D(A_1), D(A_2), D(A_3), \cdots$
in such a way that whenever $A_iA_j = A_k$ we also have $D(A_i)D(A_j) = D(A_k)$, then the
matrices themselves form a group isomorphous with $\mathcal{G}$. Such matrices
are a representation of the group; their order is the degree or dimension of
the representation. One trivial example of a representation is the unit
matrix $E$ associated with every element of the group. A representation
for $D_3$ may be obtained from its quotient group if we associate $\mathcal{H}$ with the
matrix $[1]$ and $\mathcal{J}$ with the matrix $[-1]$.

To find another representation of $D_3$, let us think of the symbols
$a$, $b$, $c$ as the components of a vector $x$ and the elements of the group as
operations which change $x$ into a new vector $x'$ with the same components
but in a different order. Hence the required representation $D$ will be a
matrix such that $x' = Dx$ where the rows and columns are labelled with the
components $a$, $b$, $c$. Now $E$ is the operation which replaces each component
by itself so $D(E)$ is the unit matrix. On the other hand, $A$ changes $a$ to $b$,
etc., so unity will appear in $D(A)$ at the intersection of the $a$-th row and
the $b$-th column, etc. Continuing in this way, we find

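$$D(E) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}; \quad
D(A) = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}; \quad
D(B) = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$

$$D(C) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}; \quad
D(D) = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}; \quad
D(F) = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$ (15-5a)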


By multiplying the matrices together, it will be seen that the multiplication
table (2) is reproduced. For example, $D(A)D(B) = D(E)$ and
$D(A)D(C) = D(D)$. Thus (5a) is a representation of $D_3$.
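
The construction of these permutation matrices, and the claim that they reproduce the multiplication table, can be checked with a short script. The sketch assumes the rule stated above (a 1 in row $i$, column $j$ when the $i$-th letter is replaced by the $j$-th) and the cycle assignments of sec. 15.6; the names `PERMS`, `matrix` and `matmul` are illustrative.

```python
LETTERS = "abc"

# For each group element, the letters that replace a, b and c respectively.
PERMS = {"E": "abc", "A": "bca", "B": "cab", "C": "acb", "D": "cba", "F": "bac"}


def matrix(images):
    """Entry (row i, column j) is 1 when the i-th letter is replaced by the j-th."""
    return [[1 if images[i] == LETTERS[j] else 0 for j in range(3)] for i in range(3)]


def matmul(X, Y):
    """Ordinary product of two 3 x 3 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


D = {g: matrix(img) for g, img in PERMS.items()}
ROWS = ["EABCDF", "ABEDFC", "BEAFCD", "CFDEBA", "DCFAEB", "FDCBAE"]

# Compare D(p) D(q) with D(pq) for every entry of the multiplication table.
reproduces_table = all(
    matmul(D[p], D[q]) == D[ROWS[i][j]]
    for i, p in enumerate("EABCDF") for j, q in enumerate("EABCDF")
)
traces = {g: sum(D[g][i][i] for i in range(3)) for g in "EABCDF"}
print(reproduces_table, traces)
```

The traces are the numbers of letters left unchanged by each permutation, a fact that becomes useful once characters are introduced.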

Suppose a representation of a group has been found, consisting of
matrices $D = D(A_1), D(A_2), \cdots, D(A_g)$, each matrix being of dimension
$n$. Then it is often possible to find a new coordinate system, i.e., a
transformation of the type $Q^{-1}DQ$, such that every matrix $D$ is changed to the
form

$$Q^{-1}DQ = \begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix}$$ (15-6)

where $D_1$ is of order $m < n$ and $D_2$ is of order $(n - m)$. Under these
conditions, the representation $D$ is said to be reducible into $D_1$ and $D_2$.

⁴ The reader will observe that considerable attention is paid here to representations;
the reasons will become clear when sec. 15.19 is read. The mathematical theory of
representations is discussed by Murnaghan, F. D., “The Theory of Group Representations,”
The Johns Hopkins Press, Baltimore, 1938; Weyl, H., “The Classical Groups,”
Princeton University Press, Princeton, 1939.
We now examine $D_1$ and $D_2$ to see if they are reducible, continuing until $D$
is completely reduced. When this has been accomplished, we will have a
relation between the original and final coordinate systems such as $z = Qx$
and

$$Q^{-1}DQ = \mathrm{diag}[\Gamma^{(1)}, \Gamma^{(2)}, \cdots] = \Gamma$$ (15-7)

where the $\Gamma^{(i)}$ are themselves matrices.

It should be understood that if there are $g$ elements in the group, there
will be $g$ equations like (7), one for each element; $D$ means the set of $g$
matrices in the original coordinate system and $\Gamma$ means the same matrices
in the new coordinate system. Suppose there are $s$ irreducible representations
in (7), $\Gamma^{(1)}, \Gamma^{(2)}, \cdots, \Gamma^{(s)}$; each one of these is a set of $g$ matrices,
one for each element of the group, $\Gamma^{(i)} = \Gamma^{(i)}(A_1), \cdots, \Gamma^{(i)}(A_g)$.
Each $\Gamma$ is isomorphous with the corresponding $D$ in the original coordinate
system since the two sets of matrices are related to each other by a
collineatory transformation (cf. sec. 10.11).

It may happen that some $\Gamma^{(i)}$ may appear more than once or not at all
in the reduction of a given representation. To indicate this, we rewrite (7)
as

$$\Gamma = c_1\Gamma^{(1)} + c_2\Gamma^{(2)} + \cdots + c_s\Gamma^{(s)}$$ (15-8)

where the $c$'s are positive integers or zero. Such an expression, called the
direct sum, is not meant to imply that the $\Gamma^{(i)}$ are to be added. It is
simply a shorthand method of showing that the matrices $D$ have been
reduced to the form (7).

It is of considerable advantage to choose unitary or orthogonal matrices
as the representations of groups and we shall suppose that this is always
done. Under these conditions the following statements may be proved.⁵
Two irreducible representations will be orthogonal, and if $d_j$ is the dimension
of $\Gamma^{(j)}$, then

$$\sum_{A} \Gamma^{(i)}_{\mu\nu}(A)\,\Gamma^{(j)}_{\alpha\beta}(A)^{*} = \frac{g}{\sqrt{d_i d_j}}\,\delta_{ij}\,\delta_{\mu\alpha}\,\delta_{\nu\beta}$$ (15-9)

the summation to be made over the $g$ elements of the group, $A_1, A_2, \cdots, A_g$.
Moreover if there are $s$ classes of elements in a group, there will be exactly
$s$ different irreducible representations and

$$d_1^2 + d_2^2 + \cdots + d_s^2 = g$$ (15-10)

It is not always possible to obtain all $s$ of the irreducible representations
from a single set of reducible matrices $D$ since some of the $c_j$ in (8) may be
zero. If this is the case, another set of matrices $D'$ must be found and
these must be reduced in the same way until the complete set is obtained.


⁵ Murnaghan, loc. cit., Chapter III.
15.8. Reduction of a Representation. — We now wish to show how it is
possible to find all of the irreducible representations for $D_3$. Since $g = 6$
and $s = 3$, it follows from (10) that they are of dimension 1, 1 and 2. To
find the two representations⁶ of degree one, we consider the quotient group
(4) with two classes, $\mathcal{C}_1$ containing $\mathcal{H}$ and $\mathcal{C}_2$ containing $\mathcal{J}$. Its two
representations are $\Gamma^{(1)}(\mathcal{H}) = 1$; $\Gamma^{(1)}(\mathcal{J}) = 1$; $\Gamma^{(2)}(\mathcal{H}) = 1$; $\Gamma^{(2)}(\mathcal{J}) = -1$.
While these are almost trivial, it is seen that they satisfy all of the
requirements for a representation of $D_3$. They are therefore taken as its
two representations of degree one.

In order to obtain the representation of dimension two, we attempt to
reduce the matrices of (5a). No general rule can be given for effecting such
a reduction although geometric considerations may often suggest the correct
procedure. In this case reduction is accomplished by using the orthogonal
matrix⁷

$$Q = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & -1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & 0 & -2/\sqrt{6} \end{bmatrix}$$ (15-11)

For $D(A)$, we obtain

$$Q^{-1}D(A)Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1/2 & \sqrt{3}/2 \\ 0 & -\sqrt{3}/2 & -1/2 \end{bmatrix}$$

The other matrices $D(B)$, $D(C)$, $D(D)$ and $D(F)$ may be reduced in the
same way. It will be found that besides the two-dimensional representation,
we again get $\Gamma^{(1)} = 1$. We now have all of the irreducible
representations.

$$\Gamma^{(1)}(E) = \Gamma^{(1)}(A) = \Gamma^{(1)}(B) = \Gamma^{(1)}(C) = \Gamma^{(1)}(D) = \Gamma^{(1)}(F) = 1$$

$$\Gamma^{(2)}(E) = \Gamma^{(2)}(A) = \Gamma^{(2)}(B) = 1; \qquad \Gamma^{(2)}(C) = \Gamma^{(2)}(D) = \Gamma^{(2)}(F) = -1$$

$$\Gamma^{(3)}(E) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}; \quad
\Gamma^{(3)}(A) = \begin{bmatrix} -1/2 & \sqrt{3}/2 \\ -\sqrt{3}/2 & -1/2 \end{bmatrix}; \quad
\Gamma^{(3)}(B) = \begin{bmatrix} -1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & -1/2 \end{bmatrix}$$

$$\Gamma^{(3)}(C) = \begin{bmatrix} 1/2 & \sqrt{3}/2 \\ \sqrt{3}/2 & -1/2 \end{bmatrix}; \quad
\Gamma^{(3)}(D) = \begin{bmatrix} 1/2 & -\sqrt{3}/2 \\ -\sqrt{3}/2 & -1/2 \end{bmatrix}; \quad
\Gamma^{(3)}(F) = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$$

The orthogonality relation (9) is readily verified.

⁶ The reader will recall that the complex $\mathcal{H}$ must be regarded as a single element of
the quotient group $\mathcal{G}/\mathcal{H}$. It is true that $\mathcal{H}$ is made up of the elements $E$, $A$, and $B$ of
the original group $D_3$, but it acts as the unit element of $\mathcal{G}/\mathcal{H}$. The other element
of the quotient group, $\mathcal{J}$, contains the elements $C$, $D$, $F$ of $D_3$.

⁷ The argument leading to this particular form is given by Bauer, “Introduction
à la théorie des groupes,” Paris, 1933, p. 79.

Problem. Show that the irreducible representations of D 3 are orthogonal as re- 
quired by eq. (9). 
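
A numerical check of the orthogonality relation is straightforward. The sketch below assumes eq. (9) in the form $\sum_A \Gamma^{(i)}_{\mu\nu}(A)\,\Gamma^{(j)}_{\alpha\beta}(A) = (g/\sqrt{d_i d_j})\,\delta_{ij}\delta_{\mu\alpha}\delta_{\nu\beta}$ (the matrices are real, so the complex conjugate is omitted) and uses two-dimensional matrices obtained by applying $Q$ with columns $(1,1,1)/\sqrt{3}$, $(1,-1,0)/\sqrt{2}$, $(1,1,-2)/\sqrt{6}$ to the permutation matrices of sec. 15.7. Both the index convention and the variable names are assumptions of this example.

```python
import math

s = math.sqrt(3) / 2
ELEMENTS = "EABCDF"

# The three irreducible representations of D3, each as a dict element -> matrix.
GAMMA1 = {x: [[1.0]] for x in ELEMENTS}
GAMMA2 = {x: [[1.0 if x in "EAB" else -1.0]] for x in ELEMENTS}
GAMMA3 = {
    "E": [[1.0, 0.0], [0.0, 1.0]],
    "A": [[-0.5, s], [-s, -0.5]],
    "B": [[-0.5, -s], [s, -0.5]],
    "C": [[0.5, s], [s, -0.5]],
    "D": [[0.5, -s], [-s, -0.5]],
    "F": [[-1.0, 0.0], [0.0, 1.0]],
}
IRREPS = [GAMMA1, GAMMA2, GAMMA3]


def max_violation():
    """Largest deviation from the assumed orthogonality relation over all index choices."""
    g, worst = len(ELEMENTS), 0.0
    for i, Gi in enumerate(IRREPS):
        di = len(Gi["E"])
        for j, Gj in enumerate(IRREPS):
            dj = len(Gj["E"])
            for mu in range(di):
                for nu in range(di):
                    for al in range(dj):
                        for be in range(dj):
                            total = sum(Gi[x][mu][nu] * Gj[x][al][be] for x in ELEMENTS)
                            same = i == j and mu == al and nu == be
                            expected = g / math.sqrt(di * dj) if same else 0.0
                            worst = max(worst, abs(total - expected))
    return worst


print(max_violation())
```

A result at the level of rounding error indicates that the three sets of matrices are mutually orthogonal and correctly normalized.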

15.9. The Character. — The task of finding all the irreducible
representations of a given group is usually very laborious. However, for most
physical applications, it is sufficient to know only their trace, a quantity
called the character⁸ in group theory. We shall indicate the trace of
$\Gamma^{(i)}(A)$ by $\chi^{(i)}(A)$. A further simplification is afforded by the
fact that elements in the same class are obtained from each other by a
similarity transformation, hence the character of every element in a single
class is identical. Therefore if we know all the characters of one element
from every class of the group, we have all of the information concerning the
group which is usually needed. We shall indicate the particular class to
which we refer by a subscript, so that the $s$ characters $\chi_1^{(i)}, \chi_2^{(i)}, \cdots, \chi_s^{(i)}$
refer to the $i$-th irreducible representation. All of the characters are
collected in Table 1 where the column indicates the class, and the row indicates
the irreducible representation.


TABLE 1

              C_1          C_2          ...    C_s

  Γ^(1)       χ_1^(1)      χ_2^(1)      ...    χ_s^(1)
  Γ^(2)       χ_1^(2)      χ_2^(2)      ...    χ_s^(2)
  ...
  Γ^(s)       χ_1^(s)      χ_2^(s)      ...    χ_s^(s)


The following properties of the characters may be derived⁹ or verified
using tables of characters given in later sections.

a. The class $\mathcal{C}_1 = E$ is always represented by the unit matrix, thus
$\chi_1^{(i)}$ equals the dimension of the representation and hence must be a divisor
of the order of the group. We also see from (10) that

$$\sum_{i=1}^{s} \left[\chi_1^{(i)}\right]^2 = g$$ (15-12)

⁸ The character (especially of permutation groups) is treated in detail by Littlewood,
D. E., “The Theory of Group Characters,” Oxford University Press, 1940.

⁹ Cf. Speiser, loc. cit., Chapter 12.
If $g$ and $s$ are known, it will usually be found that there is but a single way
in which this equation can be satisfied.

b. From (9) it follows that the $s$ characters also form an orthogonal
system. Summing over the classes we obtain

$$\sum_{\rho=1}^{s} r_\rho\,\chi_\rho^{(i)}\,\chi_\rho^{(j)*} = g\,\delta_{ij}$$ (15-13)

where $r_\rho$ is the number of elements in the $\rho$-th class.

c. If $\Xi$ is the character of a reducible representation, then from (8),
we have

$$\Xi_\rho = c_1\chi_\rho^{(1)} + c_2\chi_\rho^{(2)} + \cdots + c_s\chi_\rho^{(s)}$$ (15-14)

On multiplying this by $r_\rho\chi_\rho^{(i)*}$ and summing over $\rho$ we obtain, using (13),

$$c_i = \frac{1}{g}\sum_{\rho} r_\rho\,\Xi_\rho\,\chi_\rho^{(i)*}$$ (15-15)

When the complete multiplication table for a group is known, the following
procedure¹⁰ may be used to obtain the characters. First calculate the
product of all elements in the class $\mathcal{C}_i$ by all elements in $\mathcal{C}_k$. It will be
found that the resulting set of elements may be uniquely arranged in classes
and that the same results are obtained irrespective of whether we multiply
$\mathcal{C}_i$ by $\mathcal{C}_k$ or the reverse. Now a given class may occur in the products
several times or not at all. Let us use $h_{ikj}$ to indicate the number of times
the $j$-th class appears. Then if we abandon our earlier rule for the
multiplication of complexes (cf. sec. 15.4) and take each element of the product
as many times as it occurs, we may write

$$\mathcal{C}_i\mathcal{C}_k = \mathcal{C}_k\mathcal{C}_i = \sum_{j} h_{ikj}\,\mathcal{C}_j$$

where we sum over the total number of classes, $s$. Having found the numbers
$h_{ikj}$ it is then possible to find the characters from the relations

$$r_i r_k\,\chi_i\chi_k = \chi_1 \sum_{j=1}^{s} h_{ikj}\,r_j\,\chi_j$$ (15-16)

where $r_i$ is the number of elements in $\mathcal{C}_i$.

As an example of the use of this equation, we find for $D_3$

$$\mathcal{C}_2^2 = A^2, AB, BA, B^2 = 2\mathcal{C}_1 + \mathcal{C}_2$$

$$\mathcal{C}_3^2 = 3\mathcal{C}_1 + 3\mathcal{C}_2; \qquad \mathcal{C}_2\mathcal{C}_3 = 2\mathcal{C}_3$$

¹⁰ Proof of the statements in this paragraph may be found in Murnaghan, p. 88 or
Speiser, p. 170, loc. cit. They may be verified by using the multiplication table for $D_3$.
(The other products are not needed.) Since $r_1 = 1$; $r_2 = 2$; $r_3 = 3$, we
have

$$4\chi_2^2 = \chi_1(2\chi_1 + 2\chi_2)$$

$$9\chi_3^2 = \chi_1(3\chi_1 + 6\chi_2)$$ (15-17)

$$6\chi_2\chi_3 = 6\chi_1\chi_3$$

From (12), we know that $\chi_1$ has the values 1, 1 and 2. Solving (17) with
each of these quantities in turn we obtain the entries in Table 2. They are
identical with the trace of the matrices of the last section.


TABLE 2

              C_1     C_2     C_3

  Γ^(1)        1       1       1
  Γ^(2)        1       1      -1
  Γ^(3)        2      -1       0


Let us apply eq. (15) to the matrices (5a) and confirm a fact that we
already know, namely, that these reducible representations contain $\Gamma^{(1)}$
and $\Gamma^{(3)}$ once each but not $\Gamma^{(2)}$. From (5a), we see that $\Xi_1 = 3$; $\Xi_2 = 0$;
$\Xi_3 = 1$, hence, using eq. (15),

$$c_1 = (1\cdot3\cdot1 + 2\cdot0\cdot1 + 3\cdot1\cdot1)/6 = 1$$

$$c_2 = (1\cdot3\cdot1 + 2\cdot0\cdot1 - 3\cdot1\cdot1)/6 = 0$$

$$c_3 = (1\cdot3\cdot2 - 2\cdot0\cdot1 + 3\cdot1\cdot0)/6 = 1$$

We have shown how a reducible representation of the group $D_3$ may be
found (cf. sec. 15.7). Now elements occur on the diagonal of the matrices
of eq. (5a) only when the symbols are unchanged by the permutations with
which $D_3$ is isomorphous. Since these diagonal elements are all unity, the
reducible character of an element of a permutation group is equal to the
number of symbols unchanged by the permutation. This result is very
useful, for every group is isomorphous with some permutation group;
hence when the latter is known it is a simple matter to find $\Xi_\rho$.

Problem. Derive Table 2 by the method described in the text. 
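
The reduction of a permutation representation by means of characters can be written out in a few lines. The sketch below assumes eq. (15) in the form $c_i = (1/g)\sum_\rho r_\rho\,\Xi_\rho\,\chi_\rho^{(i)}$ (the characters of $D_3$ are real, so no conjugation is needed), takes the reducible character of each class as the number of letters left unchanged by a representative permutation, and uses the entries of Table 2. The variable names are illustrative.

```python
CLASS_SIZES = [1, 2, 3]                 # r_rho for the classes {E}, {A, B}, {C, D, F}
CHARACTERS = [[1, 1, 1],                # Gamma(1)
              [1, 1, -1],               # Gamma(2)
              [2, -1, 0]]               # Gamma(3)
REPRESENTATIVES = ["abc", "bca", "acb"]  # letters replacing a, b, c under E, A and C


def fixed_points(images, letters="abc"):
    """Number of letters left unchanged by a permutation."""
    return sum(1 for x, y in zip(letters, images) if x == y)


def multiplicities(reducible):
    """Number of times each irreducible representation occurs in the reducible one."""
    g = sum(CLASS_SIZES)
    result = []
    for row in CHARACTERS:
        total = sum(r * xi * chi for r, xi, chi in zip(CLASS_SIZES, reducible, row))
        assert total % g == 0, "multiplicities must be whole numbers"
        result.append(total // g)
    return result


reducible = [fixed_points(p) for p in REPRESENTATIVES]
print(reducible, multiplicities(reducible))
```

The same function also confirms the row orthogonality of Table 2: feeding it one of the irreducible characters returns a 1 in the corresponding position and zeros elsewhere.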

15.10. The Direct Product. — Two cyclic groups were combined in (1)
to form a single larger group. We now describe another method of augmenting
the order of a group. Suppose $\mathcal{G}'$ is of order m with elements $A_1, A_2, \cdots, A_m$
and $\mathcal{G}''$ is of order n with elements $B_1, B_2, \cdots, B_n$ and that
every A commutes with every B. Then the mn elements $A_iB_j$ form a group
of order mn called the direct product of $\mathcal{G}'$ and $\mathcal{G}''$, $\mathcal{G} = \mathcal{G}' \times \mathcal{G}''$. If the
matrices $\Gamma(A)$ and $\Gamma(B)$ are irreducible representations of $\mathcal{G}'$ and $\mathcal{G}''$, then
their direct product

$\Gamma(A) \times \Gamma(B) = \Gamma(AB)$

is a representation of $\mathcal{G}$. Moreover, if $\chi(A_i)$ is a character of $A_i$ in $\mathcal{G}'$ and
$\chi(B_j)$ belongs to $B_j$ in $\mathcal{G}''$, then the characters of $A_iB_j$ in $\mathcal{G}$ are given by

$\chi(A_iB_j) = \chi(A_i)\,\chi(B_j)$

If one or both of the representations $\Gamma(A)$ and $\Gamma(B)$ are of the first degree,
the direct product $\Gamma(AB)$ is irreducible. If both are of degree higher than
one, $\Gamma(AB)$ is reducible. The reduction is very simple provided the table
of characters for both groups is known, for multiplication of one set of
characters by another will give a sum of characters already contained in
the table. This can always be uniquely resolved into its component parts.
An illustration of such reduction will be given in sec. 15.18.
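A short numerical sketch may help fix the idea of the direct product of matrices. It assumes the direct product is the usual Kronecker product and checks two facts: the trace (character) of a direct product is the product of the traces, and direct products multiply factor by factor. The helper names `kron`, `matmul` and `trace` are illustrative only.

```python
def kron(a, b):
    """Kronecker (direct) product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [
        [a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
        for i in range(n * m)
    ]


def matmul(a, b):
    """Ordinary matrix product of nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def trace(a):
    """Sum of the diagonal elements, i.e. the character of the matrix."""
    return sum(a[i][i] for i in range(len(a)))


# Example matrices (arbitrary integer matrices chosen for exact arithmetic).
A1 = [[0, -1], [1, -1]]
A2 = [[0, 1], [1, 0]]
B1 = [[2, 1], [0, 3]]
B2 = [[1, 0], [0, -1]]

character_of_product = trace(kron(A1, B1))      # equals trace(A1) * trace(B1)
product_of_characters = trace(A1) * trace(B1)
```

The factor-by-factor multiplication rule is what guarantees that the matrices $\Gamma(A) \times \Gamma(B)$ multiply in the same way as the elements $AB$.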


SOME SPECIAL GROUPS 

15.11. The Cyclic Group. — If a cyclic group is formed from {A},
$A^n = E$ and X is any element of the group defined by $X = A^m$, m = 1, 2,
$\cdots$, n, then $X^{-1}AX = A$. It thus follows that every element of a cyclic
group or any other Abelian group is in a class by itself. Moreover, we see
from (10) that the n irreducible representations will each be of degree one
so that each representation is also a character. Now if $\epsilon = \exp(2\pi i/n)$,
then $\epsilon$ will be a representation and a character for A and $\epsilon^m$ will be a character
for $A^m$ (m = 1, 2, $\cdots$, n), since these n numbers will satisfy the multiplication
properties of the group elements. Moreover, $\epsilon^{2m}$ will also serve
as a set of characters for the same reason. In fact the n distinct powers of
$\epsilon^m$ (m = 1, 2, $\cdots$, n) will give the n characters for each of the n elements.
They are shown in Table 3. We can simplify such a table using
de Moivre's theorem: $\epsilon^p = \cos 2\pi p/n + i \sin 2\pi p/n$. For example, if
n = 4, the only numbers that will occur are $\pm 1$ and $\pm i$.
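As an illustration, the character table of a cyclic group can be generated directly from powers of $\epsilon$. The sketch below assumes the rows are labelled so that row r holds the characters $\epsilon^{rp}$ of the elements $A^p$ (p = 0 standing for E); the function names are hypothetical.

```python
import cmath


def cyclic_character_table(n):
    """Rows: the n one-dimensional representations; columns: E, A, ..., A^(n-1)."""
    eps = cmath.exp(2j * cmath.pi / n)
    return [[eps ** (r * p) for p in range(n)] for r in range(n)]


def rows_orthogonal(table, tol=1e-9):
    """Check that distinct rows are orthogonal and each row has norm n."""
    n = len(table)
    for r in range(n):
        for s in range(n):
            inner = sum(table[r][p] * table[s][p].conjugate() for p in range(n))
            expected = n if r == s else 0
            if abs(inner - expected) > tol:
                return False
    return True


table_4 = cyclic_character_table(4)
```

For n = 4 every entry of `table_4` is, up to rounding, one of $\pm 1$, $\pm i$, in agreement with the remark above.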


TABLE 3

            E     A            A^2            ...   A^(n-1)
Γ^(1)       1     1            1              ...   1
Γ^(2)       1     ε            ε^2            ...   ε^(n-1)
Γ^(m)       1     ε^(m-1)      ε^2(m-1)       ...   ε^(n-1)(m-1)
Γ^(n)       1     ε^(n-1)      ε^2(n-1)       ...   ε^(n-1)^2

15.12. The Symmetric Group. — Consider a particular permutation of
five letters which we write as

a b c d e
c e b a d

This is to be interpreted as meaning: a is replaced by c, b by e, c by b,
d by a, and e by d. A more convenient and equivalent form for such a
permutation is

$P_e = (acbed)$

which we have already used in sec. 15.6. The one-line form is called a 
cycle; its degree equals the number of letters in the parenthesis. It will be 
found that any permutation may be written as a single cycle with no letter 
repeated, or as a product of two or more cycles, none of which has a letter in 
common. Provided their proper sequence is retained, the letters in a cycle 
may be rearranged, but the number and degree of all cycles corresponding 
to a given permutation is unique. For example, 

$P_o = \begin{pmatrix} a & b & c & d & e \\ c & e & a & b & d \end{pmatrix} = (ac)(bed) = (bed)(ac) = (dbe)(ca)$, etc.

A cycle of degree two is called a transposition. A cycle of higher degree 
may be rewritten as a product of two or more transpositions in several 
different ways, but then the product will contain the same letter or letters 
in two or more parentheses. However, if the original cycle contained an 
even number of letters, the product of transpositions will be composed of an 
odd number of transpositions and if the original cycle contained an odd 
number of letters, the product will have an even number of transpositions. 
Since any permutation may be decomposed into a product of cycles, and 
each of the latter may be written as a product of transpositions, it follows 
that any permutation may be factored into a product of transpositions. 
Moreover, all the different products corresponding to a given permutation 
contain either an even number or all contain an odd number of transposi- 
tions. This property of a permutation is unique, and permits us to speak 
of even and odd permutations, $P_e$ and $P_o$. As examples, we see that

$P_e = (ac)(ab)(ae)(ad) = (ac)(cb)(be)(ed)$, etc.

$P_o = (ac)(be)(ed) = (ca)(be)(bd)$, etc.
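The parity rule can be sketched in a few lines: decompose a permutation into disjoint cycles and count one transposition less than the number of letters in each cycle. The two permutations below are the five-letter examples $P_e$ and $P_o$ of this section; the function names are illustrative.

```python
def cycles(perm):
    """Disjoint cycles of a permutation given as a dict letter -> image."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        result.append(tuple(cycle))
    return result


def is_even(perm):
    """A cycle of k letters is a product of k - 1 transpositions."""
    return sum(len(c) - 1 for c in cycles(perm)) % 2 == 0


# a -> c, b -> e, c -> b, d -> a, e -> d
P_e = dict(zip("abcde", "cebad"))
# a -> c, b -> e, c -> a, d -> b, e -> d
P_o = dict(zip("abcde", "ceabd"))
```

Here `cycles(P_e)` yields the single cycle (acbed), while `cycles(P_o)` yields (ac) and (bed).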

The symmetric group of order n! is defined as the group of all permutations,
both even and odd, of n letters. The set of n!/2 even permutations
of n letters forms a subgroup of the symmetric group, of order n!/2; it is
called the alternating group. A simple consideration shows it to be an
invariant subgroup. The odd permutations contained in the symmetric
group do not alone form a group, since the product of two odd permutations
is even. However, the complex of odd permutations is one of the
elements of the quotient group of order two which is isomorphous with
the symmetric group, the other element being the complex of even permutations.

Problem a. Construct elements and group table for the symmetric group on four 
letters. Decompose all elements into transpositions. 

Suppose a permutation has been factored into $\alpha$ cycles of degree one,
$\beta$ cycles of degree two, etc. We describe this arrangement by the symbol
$(1^{\alpha}2^{\beta}3^{\gamma}\cdots)$ which is called a partition. It is easy to see that any permutation
$P$ and its inverse $P^{-1}$ will belong to the same partition, for $P^{-1}$
is formed from $P$ by reversing the order of the letters in the cycles of $P$.
Thus

$P_o = (ac)(bed); \quad P_o^{-1} = (ca)(deb)$

It is also true that elements in the same class belong to the same partition
and that there are as many classes as partitions (cf. Problem a). Now if
the total number of letters in a permutation is n, we must have

$\alpha + 2\beta + 3\gamma + \cdots = n$

hence the number of possible partitions or the number of classes equals the
number of distinct solutions of this equation in positive integers or zero.

In order to find the number of elements in a class we must find the
number of permutations having the same cycle structure. Suppose there
are n letters and that the particular class under consideration belongs to
the partition $(1^{\alpha}2^{\beta}3^{\gamma}\cdots)$. There are n! ways of arranging the n letters
but not all of the arrangements will lead to a different permutation. For
instance, we may start a given cycle with any letter in it; i.e., (abc), (bca)
and (cab) are identical. This fact means that $1^{\alpha}2^{\beta}3^{\gamma}\cdots$ arrangements will
differ only by cyclic permutation within the various cycles. There is still
another possibility of duplication. It does not matter whether we write
(ab)(cd) or (cd)(ab), hence there are $\alpha!\,\beta!\,\gamma!\cdots$ interchanges of this kind,
each corresponding to the same permutation. We thus conclude that the
number of different arrangements or the number of elements in a class
symbolized by the partition $(1^{\alpha}2^{\beta}3^{\gamma}\cdots)$ equals

$\dfrac{n!}{1^{\alpha}\alpha!\; 2^{\beta}\beta!\; 3^{\gamma}\gamma!\cdots}$   (15-18)


Application of the methods just described will show that for n = 4,
there are 5 classes corresponding to the partitions $(1^4)$, $(1^2,2)$, $(1,3)$, $(2^2)$,
$(4)$. Typical elements of each class are E = (a)(b)(c)(d); (ab); (abc);
(ab)(cd); (abcd). The number of elements in each class is 1, 6, 8, 3 and 6,
respectively. The complete class of $(2^2)$ is $\mathcal{C}_4$ = (ab)(cd); (ac)(bd);
(ad)(bc).
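The counting rule of eq. (18) is easy to mechanize. The sketch below enumerates the partitions of n as tuples of cycle lengths and evaluates $n!/(1^{\alpha}\alpha!\,2^{\beta}\beta!\cdots)$ for each; the helper names `partitions` and `class_size` are illustrative.

```python
from collections import Counter
from math import factorial


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of cycle lengths."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for k in range(min(n, largest), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest


def class_size(cycle_lengths):
    """Number of permutations with the given cycle structure, eq. (18)."""
    n = sum(cycle_lengths)
    denominator = 1
    for length, count in Counter(cycle_lengths).items():
        denominator *= length ** count * factorial(count)
    return factorial(n) // denominator


# Classes of the symmetric group on four letters, in the order used in the text.
sizes_n4 = [class_size(p) for p in [(1, 1, 1, 1), (2, 1, 1), (3, 1), (2, 2), (4,)]]
```

The class sizes always add up to n!, which gives a quick consistency check on any hand calculation.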

Problem b. Verify the statements of the preceding paragraph. 

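Two irreducible representations of the symmetric group are found
immediately from the quotient group, for if the even and odd classes are
indicated by $\mathcal{C}^+$ and $\mathcal{C}^-$, we have

$\Gamma^{(1)}(\mathcal{C}^+) = \Gamma^{(1)}(\mathcal{C}^-) = 1$

$\Gamma^{(1')}(\mathcal{C}^+) = 1; \quad \Gamma^{(1')}(\mathcal{C}^-) = -1$   (15-19)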

All other irreducible representations are of higher degree. From each one
of these (and also from $\Gamma^{(1)}$ as shown in (20)), a new representation called
the associated representation can be obtained by forming the direct product

$\Gamma^{(j')} = \Gamma^{(j)} \times \Gamma^{(1')}$   (15-20)

Both $\Gamma^{(j)}$ and $\Gamma^{(j')}$ have the same dimensions and $(\Gamma^{(j')})' = \Gamma^{(j)}$. If
$\Gamma^{(j)} = \Gamma^{(j')}$, the two representations are self-associated. Since $\Gamma^{(1')} = +1$
for even classes and $-1$ for odd classes, it follows that

$\chi^{(j')}(\mathcal{C}^+) = \chi^{(j)}(\mathcal{C}^+)$
$\chi^{(j')}(\mathcal{C}^-) = -\chi^{(j)}(\mathcal{C}^-)$   (15-21)

In order to satisfy (21), the character of $\mathcal{C}^-$ for a self-associated representation
must be equal to zero.

Provided n < 5, a simple method may be used to obtain the complete 
table of characters for the symmetric group. When n ≥ 5, this procedure
will not give the characters for all the classes but actually it still gives the 
characters which are of interest for physical problems. The restriction 
on n is not a defect of the theory, since eq. (22) which follows is a simpli- 
fied form of the general polynomial which applies for any value of n. 

Suppose there are in a given class of the group p cycles of degrees
$\lambda_1, \lambda_2, \cdots, \lambda_p$ with $\lambda_1 + \lambda_2 + \cdots + \lambda_p = n$. Then $\chi^{(k)}$ is the coefficient of
$x^k$, $k \le n/2$, in the polynomial

$(1 - x)(1 + x^{\lambda_1})(1 + x^{\lambda_2}) \cdots (1 + x^{\lambda_p}) = \sum_k \chi^{(k)} x^k$   (15-22)

The coefficients of the highest power of $x^k$, that is, k = 1 for n = 3 and
k = 2 for n = 4, are the characters of the self-associated representation.

We originally denoted $\Gamma^{(1')}$ by the symbol $\Gamma^{(2)}$. It is convenient here to use a
different notation in order to show the relation between $\Gamma^{(j)}$ and $\Gamma^{(j')}$.

For proof of this statement and a derivation of the method with n < 5, see Wigner,
E., "Gruppentheorie und ihre Anwendung auf Quantenmechanik der Atomspektren,"
Braunschweig, 1931, Chapter XIII.

We thus obtain from (22) the characters of k representations of the group
while those of the remaining (s − k) representations are the associated ones
which may be obtained by using (21). We illustrate the procedure for the
symmetric group of order 4!.

For the partition $(1^4)$, $\lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = 1$ and the polynomial is
$(1 - x)(1 + x)^4$. Since $k \le n/2$, we take the coefficients of $x^0$, $x$ and $x^2$
which are 1, 3 and 2. The last value, 2, is the character of a self-associated
representation as previously pointed out. The class under consideration
is even, hence the associated characters are 3 and 1, completing the first
column of the character table. For the next class $(1^2,2)$, we have
$\lambda_1 = \lambda_2 = 1$, $\lambda_3 = 2$; the polynomial is $(1 - x)(1 + x)^2(1 + x^2)$ and the
coefficients of $x^0$, $x$ and $x^2$ are 1, 1, 0. The class is odd and the associated
characters are −1, −1. The remaining polynomials are $(1 - x)(1 + x)(1 + x^3)$;
$(1 - x)(1 + x^2)^2$ and $(1 - x)(1 + x^4)$. All of the characters
are given in Table 4. We have added the number of elements in each class
and indicated by signs the even and odd classes.


TABLE 4

Class               (1^4)+   (1^2,2)-   (1,3)+   (2^2)+   (4)-
No. of Elements        1        6          8        3       6
Γ^(1)                  1        1          1        1       1
Γ^(2)                  3        1          0       -1      -1
Γ^(3)                  2        0         -1        2       0
Γ^(4) = Γ^(2')         3       -1          0       -1       1
Γ^(5) = Γ^(1')         1       -1          1        1      -1
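The polynomial method can be sketched as follows for n = 4. The code multiplies out $(1 - x)\prod_i (1 + x^{\lambda_i})$, keeps the coefficients with $k \le n/2$, and then appends the associated characters by multiplying with +1 for an even class and −1 for an odd one. The function names are illustrative, and `s4_column` is deliberately restricted to four letters, where the last direct coefficient belongs to the self-associated representation.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def direct_characters(cycle_lengths):
    """Coefficients of x^k, k <= n/2, in (1 - x) * prod(1 + x^lambda_i)."""
    n = sum(cycle_lengths)
    poly = [1, -1]  # the factor (1 - x)
    for lam in cycle_lengths:
        poly = poly_mul(poly, [1] + [0] * (lam - 1) + [1])
    return poly[: n // 2 + 1]


def s4_column(cycle_lengths):
    """Full column of the character table for one class of four letters."""
    if sum(cycle_lengths) != 4:
        raise ValueError("this sketch only handles n = 4")
    direct = direct_characters(cycle_lengths)
    # A cycle of k letters contributes k - 1 transpositions to the parity.
    sign = (-1) ** sum(lam - 1 for lam in cycle_lengths)
    return direct + [sign * direct[1], sign * direct[0]]


table_columns = [s4_column(c) for c in [(1, 1, 1, 1), (2, 1, 1), (3, 1), (2, 2), (4,)]]
```

Each entry of `table_columns` is one column of Table 4, read from the first representation down to the last.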


15.13. The Alternating Group. — If two elements $A_i$ and $A_j$ of the
symmetric group are in the same class, it does not follow that they will 
belong to the same class of the alternating group. Any even class of the 
symmetric group which contains none or one cycle of odd order or no cycles 
of even order will split into two classes in the alternating group, each of the 
new classes containing half as many elements as it contained in the sym- 
metric group. For example A and B of (5) belong to the same class of the 
symmetric group with n = 3, but to different classes of the alternating 
group, as may be verified from (2). 

The characters of the symmetric group which are not self-associated are
also characters of the alternating group. Every character of a self-associated
representation is the sum of two equal characters for the alternating
group except for the two classes which have been obtained by splitting a
class of the symmetric group. Thus if n = 3 or 4 and the character table
is known for the symmetric group, we can fill the character table for the
alternating group except for four blank spaces. Suppose the two classes
whose entry in the table is blank are obtained from the partition
$(\lambda_1,\lambda_2,\lambda_3,\cdots)$. Then if $\mu = \lambda_1\lambda_2\lambda_3\cdots$, the character $\chi_S = \pm 1$ will
occur in the symmetric group at the intersection of the row corresponding
to the self-associated representation and the column of the class in question
while in the alternating group we will have

$\chi_A = \dfrac{\chi_S \pm \sqrt{\chi_S\,\mu}}{2}$   (15-23)

The two remaining vacant places in the table are filled by interchanging
the two characters given by (23).

For n = 4, there are 4 classes since $(1,3)^+$ splits into $(1,3)'$ and $(1,3)''$.
The self-associated representation is $\Gamma^{(3)}$. Its characters become (1,x,y,1)
and (1,y,x,1) where x and y obtained from (23) are $(-1 \pm i\sqrt{3})/2$ since
$\lambda_1 = 1$, $\lambda_2 = 3$, $\mu = 3$. Writing $\epsilon = \exp(2\pi i/3)$, we thus have $x = \epsilon$,
$y = \epsilon^2$. This completes the calculation as shown in Table 5.


TABLE 5

Class               (1^4)   (1,3)'   (1,3)''   (2^2)
No. of Elements       1       4        4         3
Γ^(1)                 1       1        1         1
Γ^(2)                 3       0        0        -1
Γ^(3)                 1       ε        ε^2       1
Γ^(4)                 1       ε^2      ε         1
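A quick numerical check of Table 5 is sketched below. It verifies the orthogonality of the rows, weighted by the class sizes, and evaluates the two split characters under the assumption that eq. (23) has the form $(\chi_S \pm \sqrt{\chi_S\,\mu})/2$; that form is inferred here from the values $(-1 \pm i\sqrt{3})/2$ quoted for n = 4, and the names used are illustrative.

```python
import cmath

eps = cmath.exp(2j * cmath.pi / 3)

# Class sizes and character rows of the alternating group on four letters.
SIZES = [1, 4, 4, 3]
TABLE = [
    [1, 1, 1, 1],
    [3, 0, 0, -1],
    [1, eps, eps ** 2, 1],
    [1, eps ** 2, eps, 1],
]


def inner(r, s):
    """Class-weighted inner product of rows r and s of TABLE."""
    return sum(
        h * complex(a) * complex(b).conjugate()
        for h, a, b in zip(SIZES, TABLE[r], TABLE[s])
    )


def split_characters(chi_s, mu):
    """Assumed form of eq. (23): (chi_s +/- sqrt(chi_s * mu)) / 2."""
    root = cmath.sqrt(chi_s * mu)
    return (chi_s + root) / 2, (chi_s - root) / 2


x, y = split_characters(-1, 3)
```

With $\chi_S = -1$ and $\mu = 3$ the pair `(x, y)` coincides with $(\epsilon, \epsilon^2)$, and every pair of distinct rows has vanishing inner product.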


15.14. The Unitary Group. — The collection of all non-singular matrices
of order n, with matrix multiplication as the law of combination, is the
representation of a group called the full linear group (FLG). The order
of the group is infinite, for its elements are the infinite number of linear
transformations that change a vector x into a new vector. This group has
many subgroups obtained by imposing certain restrictions on the matrices
of its transformations. Thus, we might exclude all matrices except those
with determinant equal to ±1 or we might require that the matrices be
orthogonal. Such groups are discrete, if the elements are infinitely denumerable
(an example of a discrete group of this type is given in sec. 15.1);
continuous, if the elements are non-denumerable. An example is the
group of rotations about an axis. One may also have mixed-continuous
groups such as $R^{\pm}(2)$ discussed in sec. 15.16. Infinite groups have many
of the properties of finite groups, although naturally some modifications
in their treatment are necessary.

See, for example, Wigner, loc. cit., Chapter X.

We first consider a subgroup of FLG, which is called the two-dimensional
unimodular unitary group (SUG, special unitary group). Its elements are
square unitary matrices of order two with determinant of +1. Let us
take a matrix

$\begin{bmatrix} a & b \\ c & d \end{bmatrix}$

and modify it so that these conditions are met. Referring to eq. (10-50),
we see that we must have $c = -b^*$ and $d = a^*$. Thus a typical element of
SUG is

$U = \begin{bmatrix} a & b \\ -b^* & a^* \end{bmatrix}; \quad aa^* + bb^* = 1$   (15-24)

When this matrix is applied to a column vector $x = \{x_1, x_2\}$ so that
$Ux = x'$, we have

$x_1' = a x_1 + b x_2$
$x_2' = -b^* x_1 + a^* x_2$   (15-25)
It will also transform any function of x into a function of linear combinations of $x_1$,
$x_2$; for example,

$U f(x) = f(x') = f(ax_1 + bx_2,\; -b^*x_1 + a^*x_2)$   (15-26)

Thus if U operates on a set of (n + 1) homogeneous products

$f_p = x_1^{p}\, x_2^{\,n-p} \quad (p = 0, 1, 2, \cdots, n)$   (15-27)

the result is a homogeneous polynomial of the same degree

$U f_p = (ax_1 + bx_2)^{p}\,(-b^*x_1 + a^*x_2)^{n-p} = \sum_{k=0}^{n} U_{pk}\, f_k$   (15-28)

Clearly, the two-dimensional matrices U are themselves representations
of SUG. But the matrices with elements $U_{pk}$, being isomorphous with U
because of eq. (28), must also be representations, provided we can show
that they are unitary. As a matter of fact, they are not unitary, but if
each element is multiplied by $[p!(n - p)!]^{-1/2}$, they become so. Multiplication
of the elements by this constant factor is, of course, equivalent to
multiplying $f_p$ by the same quantity. When we do this, we find it convenient
to set n = 2j; p = j + m. The purpose of the latter substitution
is to enable us to prove in sec. 15.15 that SUG is isomorphous with the
three-dimensional rotation group.

When these changes have been made, $f_p$ becomes

$f_m^{(j)} = \dfrac{x_1^{\,j+m}\, x_2^{\,j-m}}{\sqrt{(j+m)!\,(j-m)!}}$   (15-29)

where $j = 0, \tfrac12, 1, \tfrac32, \cdots$; $m = -j, -j+1, \cdots, j-1, j$. At the same
time, eq. (28) becomes

$U f_m^{(j)} = \dfrac{(ax_1 + bx_2)^{j+m}\,(-b^*x_1 + a^*x_2)^{j-m}}{\sqrt{(j+m)!\,(j-m)!}} = \sum_{q=-j}^{j} U_{mq}^{(j)}\, f_q^{(j)}$   (15-30)

The resulting matrices whose elements are $U_{mq}^{(j)}$ will be indicated by $U^{(j)}$.
They are unitary and irreducible; furthermore, there are no other irreducible
representations of SUG.

In order to obtain the elements of $U^{(j)}$ we develop (30) by the binomial
theorem and pick out the coefficient of $f_q^{(j)}$. It is found to be

$U_{mq}^{(j)} = \sum_t (-1)^t\, \dfrac{\sqrt{(j+m)!\,(j-m)!\,(j+q)!\,(j-q)!}}{(j-m-t)!\,(j+q-t)!\,(t-q+m)!\,t!}\; a^{\,j+q-t}\, (a^*)^{\,j-m-t}\, b^{\,t-q+m}\, (b^*)^{\,t}$   (15-31)

In this expression, t takes the values 0, 1, 2, $\cdots$ and the summation breaks
off automatically when negative powers of the a's and b's appear because
the denominator will then contain a factor $(-N)!$ which is $\infty$.

Since m and q have (2j + 1) possible values, it follows that the matrices
of the representations have dimensions of (2j + 1). If j = 0, $U^{(0)} = 1$.
If $j = \tfrac12$, m and q can take the values $\pm\tfrac12$, hence if the elements of the
matrix are characterized by $+\tfrac12$ and $-\tfrac12$ in that order, we have $U^{(1/2)}$
identical with U of (24).
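The matrices $U^{(j)}$ can be generated numerically as a check. The sketch below assumes the matrix element has the form obtained by expanding (30) with the binomial theorem, namely a sum over t of $(-1)^t \sqrt{(j+m)!(j-m)!(j+q)!(j-q)!}\; a^{j+q-t}(a^*)^{j-m-t} b^{t-q+m}(b^*)^{t} / [(j-m-t)!(j+q-t)!(t-q+m)!\,t!]$, with rows and columns ordered m, q = j, j − 1, ..., −j. The function names are illustrative.

```python
import cmath
from fractions import Fraction
from math import factorial, sqrt


def u_element(j, m, q, a, b):
    """Matrix element U^(j)_{mq} for the SUG element with parameters a, b."""
    j, m, q = Fraction(j), Fraction(m), Fraction(q)
    a, b = complex(a), complex(b)
    prefactor = sqrt(
        factorial(int(j + m)) * factorial(int(j - m))
        * factorial(int(j + q)) * factorial(int(j - q))
    )
    total = 0
    for t in range(int(2 * j) + 1):
        e1, e2, e3 = j - m - t, j + q - t, t - q + m
        if e1 < 0 or e2 < 0 or e3 < 0:
            continue  # term dropped: factorial of a negative integer in the denominator
        e1, e2, e3 = int(e1), int(e2), int(e3)
        denom = factorial(e1) * factorial(e2) * factorial(e3) * factorial(t)
        total += (-1) ** t * a ** e2 * a.conjugate() ** e1 * b ** e3 * b.conjugate() ** t / denom
    return prefactor * total


def u_matrix(j, a, b):
    """Full (2j+1)-dimensional matrix, indices running j, j-1, ..., -j."""
    j = Fraction(j)
    ms = [j - k for k in range(int(2 * j) + 1)]
    return [[u_element(j, m, q, a, b) for q in ms] for m in ms]


def is_unitary(u, tol=1e-9):
    """Check that the rows of u are orthonormal."""
    n = len(u)
    for i in range(n):
        for k in range(n):
            s = sum(complex(u[i][l]) * complex(u[k][l]).conjugate() for l in range(n))
            if abs(s - (1 if i == k else 0)) > tol:
                return False
    return True


# Sample parameters with |a|^2 + |b|^2 = 1 (arbitrary phases).
a_sample = 0.6 * cmath.exp(0.3j)
b_sample = 0.8 * cmath.exp(-1.1j)
```

For $j = \tfrac12$ the function `u_matrix` returns the two-by-two matrix with rows (a, b) and (−b*, a*), and for larger j the result can be tested for unitarity with `is_unitary`.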

In order to determine the characters of SUG let us select a typical
matrix of the group and transform it to diagonal form. A unitary transformation
is required and it is certain that among the infinite number of
unitary matrices in the group, one may be found, say V, that will effect the
diagonalization

$V^{-1} U V = U_1$

Finally, since we require $|U_1| = 1$, the coefficients of $U_1$ may be determined

$U_1 = \begin{bmatrix} e^{i\alpha/2} & 0 \\ 0 & e^{-i\alpha/2} \end{bmatrix}$   (15-33)

All other matrices of the group belong to the same class as U and $U_1$,
for the class is composed of elements which are obtained from each other by
similarity transformations or, in this particular case, by unitary transformations.
Since each matrix is unitary it remains unitary when it undergoes
such a transformation. We also know that it is only necessary to
calculate the character of one element from a class; thus using $U_1$,

$\chi^{(1/2)} = e^{i\alpha/2} + e^{-i\alpha/2}$

is the character for the representation of degree (2j + 1) = 2.

The proof of these facts will be found in Murnaghan, loc. cit., Chapter 3 or Wigner,
loc. cit., Chapter XV.

The reason for choosing $\alpha/2$ instead of $\alpha$ will become apparent in sec. 15.15.

Now the matrices $U^{(j)}$ which are identical with $U_1$ when $j = \tfrac12$ must
be transformable in such a way that the characters will be identical for
$j = \tfrac12$, and when this is done the characters should apply to SUG for any
value of j. If we substitute $a = e^{i\alpha/2}$, $b = 0$ in (31), the result is of diagonal
form since all elements disappear unless t = 0 and m = q,

$U_{mq}^{(j)} = \delta_{mq}\, e^{im\alpha}$   (15-34)

The required characters for SUG, infinite in number, are thus

$\chi^{(j)}(\alpha) = \sum_{m=-j}^{j} e^{im\alpha}$   (15-35)

A simpler form of the last expression may be obtained as follows. Let
$\rho = e^{i\alpha}$ so that

$\chi^{(j)} = \rho^{-j}(1 + \rho + \rho^2 + \cdots + \rho^{2j}) = \rho^{-j}\,\dfrac{1 - \rho^{2j+1}}{1 - \rho}$

Multiply numerator and denominator by $e^{-i\alpha/2}$ and use the relation
$\sin x = i(e^{-ix} - e^{ix})/2$; then

$\chi^{(j)}(\alpha) = \dfrac{\sin (2j + 1)\alpha/2}{\sin \alpha/2}$   (15-36)

The irreducible representations and characters satisfy certain orthogonality
and normalization conditions as in the case of finite groups, but
the summations in (9) and (13) are replaced by integrals.
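The agreement between the sum over m and the closed form can be confirmed numerically. The sketch below assumes the character is the sum of $e^{im\alpha}$ for m = −j, ..., j and compares it with $\sin[(2j+1)\alpha/2]/\sin(\alpha/2)$; the function names are illustrative.

```python
import cmath
import math
from fractions import Fraction


def character_sum(j, alpha):
    """Character as the sum of exp(i m alpha), m = -j, ..., j."""
    j = Fraction(j)
    ms = [j - k for k in range(int(2 * j) + 1)]
    return sum(cmath.exp(1j * float(m) * alpha) for m in ms).real


def character_closed(j, alpha):
    """Closed form sin((2j + 1) alpha / 2) / sin(alpha / 2)."""
    j = float(Fraction(j))
    return math.sin((2 * j + 1) * alpha / 2) / math.sin(alpha / 2)


J_VALUES = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3, 2), Fraction(2)]
differences = [abs(character_sum(j, 0.7) - character_closed(j, 0.7)) for j in J_VALUES]
```

The closed form also makes it easy to see that replacing $\alpha$ by $\alpha + 2\pi$ leaves the character unchanged for integral j and reverses its sign for half-integral j.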

15.15. The Three-Dimensional Rotation Groups. — Another important
subgroup of FLG (as well as of the n dimensional unitary group) is the
n dimensional full, real orthogonal group which consists of all unitary
matrices with real elements. If we further restrict this subgroup, choosing
all real unitary matrices with determinant equal to +1, we have the
n-dimensional proper, real orthogonal group or the rotation group. It should
be remembered that an orthogonal matrix need not be unitary but a real
orthogonal matrix and a real unitary matrix are synonymous terms. For
the moment, we consider the three-dimensional rotation group $R^{+}(3)$ whose
elements are real orthogonal matrices of order three.

See Wigner, loc. cit., Chapter XV or Eckart, Carl, Rev. Mod. Phys. 2, 344 (1930) 





Assume that we have a sphere of unit radius, the center of which coincides
with the origin of a coordinate system OXYZ fixed in space. Now let
the coordinate of some point on the surface of the sphere be (x,y,z) and
rotate the sphere in any manner whatsoever leaving its center fixed. The
new coordinates of the point (x',y',z') will be related to (x,y,z) by some
matrix $R(\alpha,\beta,\gamma)$ which is an element of $R^{+}(3)$. As we have shown in sec.
9.5, such a rotation may be factored into a product of three plane rotations
described by the Eulerian angles $(\alpha,\beta,\gamma)$; i.e., we may write

$R(\alpha,\beta,\gamma) = R_z(\gamma)\,R_y(\beta)\,R_z(\alpha)$   (15-37)

where $R_z$ and $R_y$ are rotations about the Z- and Y-axes respectively.

In order to find the representations of $R^{+}(3)$ we could use a method
similar to that of sec. 15.14 and study the effect of transforming a function
of (x,y,z) by the elements of the group. A simpler method is available
for we will show that $R^{+}(3)$ is isomorphous with SUG. Since we know the
representations of the latter, we may use the same results for $R^{+}(3)$. We
recall, however, that the elements of SUG are two-dimensional matrices
while the elements of $R^{+}(3)$ are three-dimensional, hence the proof of the
isomorphism depends upon finding some relation between these two kinds
of matrices. The problem is an old one which occurred in classical mechanics;
it was solved by Klein and by Cayley, who made use of a special kind
of transformation in the complex plane. We prefer to proceed in another
way.

We first observe that any two-dimensional matrix may be written as a
linear combination of the four matrices

$P_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}; \quad P_2 = \begin{bmatrix} 0 & i \\ -i & 0 \end{bmatrix}; \quad P_3 = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}; \quad P_4 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$   (15-38)

For example, if

$H = \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix}$

we may write

$H = c_1P_1 + c_2P_2 + c_3P_3 + c_4P_4$

where

$c_1 = (H_{21} + H_{12})/2; \quad c_2 = i(H_{21} - H_{12})/2$
$c_3 = (H_{11} - H_{22})/2; \quad c_4 = (H_{11} + H_{22})/2$

Both methods are discussed by Wigner, loc. cit., Chapter XV.

The details are given by Whittaker, E. T., "Analytical Dynamics," Third Edition,
Cambridge University Press, 1927, p. 12. The quantities a and b which appear
in our eq. (24) are identical with the Cayley-Klein parameters. Eckart, loc. cit., and
Bauer, loc. cit., have used a similar method in the group theory problem.

The first three of these are the Pauli spin matrices, discussed in sec. 11.29.





Let us take $c_1 = x$, $c_2 = y$, $c_3 = z$, $c_4 = 0$. Then we have,

$H(x,y,z) = xP_1 + yP_2 + zP_3 = \begin{bmatrix} z & x + iy \\ x - iy & -z \end{bmatrix}$   (15-39)

Clearly if x, y, z are real, H is Hermitian. Moreover, its trace is zero;
in fact, any two dimensional matrix with trace of zero may be put into this
form, $P_4$ not being needed. If H is now subjected to a unitary transformation
by the matrix U of (24) its trace is unchanged and we obtain

$H'(x',y',z') = U^{\dagger}HU = x'P_1 + y'P_2 + z'P_3$   (15-40)

If we can prove that the relation between x, y, z and x', y', z' is a rotation,
we may conclude that the matrices U of SUG perform the same transformations
as the matrices of the group $R^{+}(3)$ and that the two groups are isomorphous.
To do this, we note (see the problem in sec. 10.14) that
$|H'| = |H|$, hence

$x'^2 + y'^2 + z'^2 = x^2 + y^2 + z^2$   (15-41)

which means that the length of a vector is unchanged by the transformation
of eq. (40) and the latter must be a rotation.

Let us study some special forms of the matrix U whose general form is
given by eq. (24). We first put $a = e^{i\alpha/2}$, $b = 0$, that is, we use $U_1$, the
diagonal matrix of eq. (33). We easily find

$U_1^{\dagger}P_1U_1 = \cos\alpha\, P_1 - \sin\alpha\, P_2$
$U_1^{\dagger}P_2U_1 = \sin\alpha\, P_1 + \cos\alpha\, P_2$   (15-42)
$U_1^{\dagger}P_3U_1 = P_3$

With these results, (40) becomes

$x' = x\cos\alpha + y\sin\alpha$
$y' = -x\sin\alpha + y\cos\alpha$
$z' = z$

This clearly represents a rotation through an angle $\alpha$ about Z; it may be
suitably represented by

$\mathbf{r}' = R_z(\alpha)\,\mathbf{r}$

where $\mathbf{r}'$ and $\mathbf{r}$ are the vectors having components (x',y',z') and (x,y,z),
respectively and

$R_z(\alpha) = \begin{bmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix}$   (15-43)

We have thus identified the element of SUG which corresponds to the last
factor on the right of (37); it is $U_1$. Obviously, $R_z(\gamma)$ corresponds to a
matrix like (43) but with $\alpha$ replaced by $\gamma$.

In order to find $R_y(\beta)$ we take

$U_2 = \begin{bmatrix} \cos\beta/2 & -\sin\beta/2 \\ \sin\beta/2 & \cos\beta/2 \end{bmatrix}$   (15-44)

It is obtained by putting $a = \cos\beta/2$, $b = -\sin\beta/2$ in (24). We now find

$U_2^{\dagger}P_1U_2 = \cos\beta\, P_1 + \sin\beta\, P_3$
$U_2^{\dagger}P_2U_2 = P_2$
$U_2^{\dagger}P_3U_2 = -\sin\beta\, P_1 + \cos\beta\, P_3$

and (40) may be written $\mathbf{r}' = R_y(\beta)\,\mathbf{r}$, where

$R_y(\beta) = \begin{bmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{bmatrix}$   (15-45)

Our notation, $R_y(\beta)$, is meant to exhibit the fact that (45) represents a rotation
through $\beta$ about Y. Thus we have shown that by a proper choice of
the elements of U, SUG and $R^{+}(3)$ are isomorphous since $U =
U_1(\gamma)\,U_2(\beta)\,U_1(\alpha)$ corresponds to $R(\alpha,\beta,\gamma)$.
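The correspondence between the special matrices of SUG and the plane rotations can be checked numerically. The sketch below assumes the basis matrices $P_1 = [[0,1],[1,0]]$, $P_2 = [[0,i],[-i,0]]$, $P_3 = [[1,0],[0,-1]]$, which are the ones consistent with the coefficient formulas for $c_1$, $c_2$, $c_3$, and reads off the 3 × 3 matrix from $U^{\dagger}P_kU$. All function names are illustrative.

```python
import cmath
import math

# Assumed basis matrices, chosen to agree with the formulas for c1, c2, c3.
P1 = [[0, 1], [1, 0]]
P2 = [[0, 1j], [-1j, 0]]
P3 = [[1, 0], [0, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def dagger(a):
    """Hermitian conjugate of a square matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))] for i in range(len(a[0]))]


def rotation_from_su2(u):
    """Real 3x3 matrix R with r' = R r, where H' = U^dagger H U."""
    columns = []
    for p in (P1, P2, P3):
        h = matmul(dagger(u), matmul(p, u))
        c1 = (h[1][0] + h[0][1]) / 2
        c2 = 1j * (h[1][0] - h[0][1]) / 2
        c3 = (h[0][0] - h[1][1]) / 2
        columns.append([complex(c).real for c in (c1, c2, c3)])
    # columns[k] is the image of the k-th unit vector, i.e. column k of R
    return [[columns[k][i] for k in range(3)] for i in range(3)]


def u1(alpha):
    return [[cmath.exp(1j * alpha / 2), 0], [0, cmath.exp(-1j * alpha / 2)]]


def u2(beta):
    c, s = math.cos(beta / 2), math.sin(beta / 2)
    return [[c, -s], [s, c]]


def r_z(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, s, 0], [-s, c, 0], [0, 0, 1]]


def r_y(beta):
    c, s = math.cos(beta), math.sin(beta)
    return [[c, 0, -s], [0, 1, 0], [s, 0, c]]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(len(a)) for j in range(len(a[0])))
```

Since U enters the transformation twice, U and −U give the same rotation matrix, a point that becomes important when the representations appropriate to the rotation group are selected.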

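Let us write

$U(\alpha,\beta,\gamma) = U_1(\gamma)\,U_2(\beta)\,U_1(\alpha)$

$= \begin{bmatrix} e^{i\gamma/2} & 0 \\ 0 & e^{-i\gamma/2} \end{bmatrix} \begin{bmatrix} \cos\beta/2 & -\sin\beta/2 \\ \sin\beta/2 & \cos\beta/2 \end{bmatrix} \begin{bmatrix} e^{i\alpha/2} & 0 \\ 0 & e^{-i\alpha/2} \end{bmatrix}$

$= \begin{bmatrix} e^{i(\alpha+\gamma)/2}\cos\beta/2 & -e^{-i(\alpha-\gamma)/2}\sin\beta/2 \\ e^{i(\alpha-\gamma)/2}\sin\beta/2 & e^{-i(\alpha+\gamma)/2}\cos\beta/2 \end{bmatrix}$   (15-46)

On comparing this with (24), we see that we have $a = e^{i(\alpha+\gamma)/2}\cos\beta/2$;
$b = -e^{-i(\alpha-\gamma)/2}\sin\beta/2$, and (31) becomes

$U_{mq}^{(j)}(\alpha,\beta,\gamma) = \sum_t (-1)^{t+m-q}\, \dfrac{\sqrt{(j+m)!\,(j-m)!\,(j+q)!\,(j-q)!}}{(j-m-t)!\,(j+q-t)!\,(t-q+m)!\,t!}$
$\qquad \times \cos^{2j+q-m-2t}\beta/2 \cdot \sin^{m-q+2t}\beta/2 \cdot e^{i(q\alpha + m\gamma)}$   (15-47)

As before, $j = 0, \tfrac12, 1, \tfrac32, \cdots$. For j = 0, we get $U^{(0)}(\alpha,\beta,\gamma) = 1$;
for $j = \tfrac12$, we obtain (46). It may be shown that the matrices whose
elements are given by (47) are irreducible representations and that there
are no further ones. The characters of the representations are found from
(35). Remembering that $e^{\pm ix} = \cos x \pm i \sin x$, they may be written as

$\chi^{(j)}(\alpha) = 1 + 2\cos\alpha + \cdots + 2\cos j\alpha; \quad j = 0, 1, 2, \cdots$

$\chi^{(j)}(\alpha) = 2\cos\alpha/2 + 2\cos 3\alpha/2 + \cdots + 2\cos j\alpha; \quad j = \tfrac12, \tfrac32, \cdots$   (15-48)

See footnote 17.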

Although $R^{+}(3)$ is isomorphous with SUG, the isomorphism is not
simple. If $0 \le \alpha \le 4\pi$, $0 \le \beta \le \pi$, $0 \le \gamma \le 2\pi$, then as $\alpha$, $\beta$ and $\gamma$ take
all values between these limits, a and b of (24) will take all pairs of values
satisfying the requirement $aa^* + bb^* = 1$ once only. On the other hand,
if $\alpha$, $\beta$ and $\gamma$ are Eulerian angles their limits are $0 \le \alpha \le 2\pi$, $0 \le \beta \le \pi$,
$0 \le \gamma \le 2\pi$. But the angles occur in (46) divided by 2, hence the trigonometric
functions are undetermined with regard to sign. In other words,
every matrix $R(\alpha,\beta,\gamma)$ is isomorphous with two matrices $\pm U(\alpha,\beta,\gamma)$. We
must thus discard half of the representations of SUG in order to find the
ones appropriate to $R^{+}(3)$. It is easy to see which ones we want. From
(47) it follows that

$U_{mq}^{(j)}(\alpha + 2\pi,\beta,\gamma) = e^{2\pi i q}\, U_{mq}^{(j)}(\alpha,\beta,\gamma)$

Now when j is integral, q is also integral, for $-j \le q \le j$ and then $e^{2\pi i q} = 1$.
If j were half integral, the identical rotations $\alpha$ and $\alpha + 2\pi$ would have
representations differing in sign. However, $R(\alpha,\beta,\gamma)U(\alpha,\beta,\gamma)$ for both
integral and half-integral j values is a group which is isomorphous with
$U(\alpha,\beta,\gamma)$ and all matrices $U^{(j)}(\alpha,\beta,\gamma)$ are representations. This group is of
importance in the Pauli spin theory.

If we take as elements of an infinite group, all real unitary matrices of
order three with determinant equal to +1 as well as −1, we have the
three-dimensional full real orthogonal group $R^{\pm}(3)$. The quotient group
isomorphous with it has two elements. The unit element, which is also an
invariant subgroup of $R^{\pm}(3)$ contains E and all proper rotations R such as
(43) or (45) with $|R| = +1$. The other element of the quotient group
is an infinite number of improper rotations T with $|T| = -1$, a typical
one (cf. sec. 10.17) being

$T_z(\phi) = \begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & -1 \end{bmatrix}$   (15-49)

The simplest member of the class of T is an improper rotation by the
angle $\pi$, an operation called inversion

$T(\pi) = I = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}$   (15-50)
Chapter 11.

It is always possible to find some improper rotation T which will convert
any other improper rotation T' into an inversion, $T^{-1}T'T = I$, just as it is
always possible to find an inverse to a proper rotation, $R^{-1}R'R = E$. The
group $R^{\pm}(3)$ may thus be considered as the direct product of $R^{+}(3)$ and the
group I, the latter having elements E and I. It will have two irreducible
representations for every value of j, each being of dimension (2j + 1).
The element R has two representations both equal to $U^{(j)}$ while T has
representations $\pm U^{(j)}$.

15.16. The Two-Dimensional Rotation Groups. — The two-dimensional
pure rotation group $R^{+}(2)$ is a subgroup of $R^{+}(3)$. Its elements are the
proper rotations in a plane perpendicular to a fixed axis. Let $R(\phi)$ be one
of the elements where $0 \le \phi \le 2\pi$, then if x is a vector with components $x_1$
and $x_2$, the element $R(\phi)$ may be represented by the matrix $C(\phi)$

$x' = C(\phi)\,x; \quad 0 \le \phi \le 2\pi$   (15-51)

with

$C(\phi) = \begin{bmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{bmatrix}$   (15-52)

If $R(\phi')$ is another element of the group, which is represented by $C(\phi')$, then

$C(\phi)\,C(\phi') = C(\phi + \phi') = C(\phi')\,C(\phi)$   (15-53)

and the group is Abelian. Referring to sec. 15.11, we see that for such
groups, each element is in a class by itself and the irreducible representations
are one-dimensional. Thus (52) is reducible, a unitary matrix of
eigenvectors of C being required for that purpose since C itself is an orthogonal
matrix. The normalized eigenvectors of C are found to be

$R_1 = \dfrac{1}{\sqrt{2}}\{1, i\}; \quad R_2 = \dfrac{1}{\sqrt{2}}\{1, -i\}$   (15-54)

and the eigenvalues are $e^{\pm i\phi}$. These, then, are characters of an irreducible
representation. However, there are an infinite number of classes, so there
must be an infinite number of representations for each class. The corresponding
characters may be taken as

$\chi^{(m)}(\phi) = e^{im\phi}; \quad m = 0, \pm 1, \pm 2, \cdots$   (15-55)

for each will satisfy the multiplication requirement of the group, as indicated
by eq. (53).

The two-dimensional rotary reflection group $R^{\pm}(2)$ is composed of both
proper and improper rotations. A typical element of it is represented by
the matrix

$A(\phi,d) = \begin{bmatrix} \cos\phi & \sin\phi \\ -d\sin\phi & d\cos\phi \end{bmatrix}$   (15-56)

where d equals the determinant of $A(\phi,d)$ and may be either +1 or −1.
If d = +1, we have a proper rotation with matrix $C(\phi)$; if d = −1, an
improper rotation, the matrix of which will be indicated by $S(\phi)$

$S(\phi) = \begin{bmatrix} \cos\phi & \sin\phi \\ \sin\phi & -\cos\phi \end{bmatrix}$   (15-57)

Clearly,

$S^2(\phi) = S(\phi)\,S(\phi) = E$   (15-58)

but the group is not Abelian for $C(\phi)$ and $S(\phi)$ do not commute. In fact,

$A(\phi,d)\,A(\phi',d') = A(d'\phi + \phi',\, dd')$   (15-59)

Let us reduce $S(\phi)$ to diagonal form (cf. sec. 10.17). Its eigenvectors
are found to be

$V_1 = \{\cos\phi/2,\ \sin\phi/2\}; \quad V_2 = \{-\sin\phi/2,\ \cos\phi/2\}$   (15-60)

and the eigenvalues are $\pm 1$. The resulting diagonal matrix,

$\sigma = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$   (15-61)

which corresponds to a reflection through the axis of rotation, is that
obtained from $S(\phi)$ when $\phi$ equals 0 or $2\pi$. It is interesting to observe that
the matrix whose rows are the eigenvectors of eq. (60) is actually a proper
rotation by the angle $\phi/2$. Moreover,

$S(\phi) = C^{-1}(\phi/2)\,\sigma\,C(\phi/2)$   (15-62)

hence, an improper rotation is equivalent to a proper rotation by the angle
$\phi/2$, followed by a reflection and finally by a proper rotation of $\phi/2$ in the
opposite direction.

It will be remembered that every element of the group $R^{+}(2)$ is in a
class by itself. It does not follow, however, that the proper rotations of
$R^{\pm}(2)$ are each in a separate class. Thus the element represented by
$C(\phi)$ is in the same class with the element represented by $C'(\phi)$, since

$S^{-1}(\phi')\,C(\phi)\,S(\phi') = C'(\phi)$

where

$C'(\phi) = \begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix}$

and

$C'(\phi) = C^{-1}(\phi) = C(-\phi)$

There are an infinite number of classes as before but each class contains the
proper rotation by $\phi$ and the proper rotation by $-\phi$.

On the other hand, all improper rotations are in the same class. If
$S(\phi)$ and $S(\phi')$ are representations of two improper rotations, we find that

$C^{-1}(\phi'')\,S(\phi)\,C(\phi'') = S(\phi')$   (15-62a)

where $\phi' = \phi + 2\phi''$. This could have been inferred from eq. (62), for it
is a special case of (62a) when we set $\phi'' = \phi/2$, $\phi = 0$ or $2\pi$ and $\phi' = \phi$.
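The multiplication and conjugation rules of this section can be verified numerically. The sketch below builds $A(\phi,d)$ with rows $(\cos\phi, \sin\phi)$ and $(-d\sin\phi, d\cos\phi)$ and checks the product rule (59), the decomposition (62) and the conjugation rule (62a); the helper names are illustrative.

```python
import math


def A(phi, d):
    """Matrix of a proper (d = +1) or improper (d = -1) rotation."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s], [-d * s, d * c]]


def C(phi):
    return A(phi, 1)


def S(phi):
    return A(phi, -1)


SIGMA = [[1, 0], [0, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def product_rule_holds(phi, d, phi2, d2):
    """Check A(phi, d) A(phi2, d2) = A(d2 * phi + phi2, d * d2)."""
    return close(matmul(A(phi, d), A(phi2, d2)), A(d2 * phi + phi2, d * d2))
```

For instance, `product_rule_holds(0.3, -1, 1.1, -1)` tests the product of two improper rotations, which must be a proper rotation.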

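The representations of eq. (56) may be reduced by the matrix of eigenvectors
(54). The result is

$C^{(m)}(\phi) = \begin{bmatrix} e^{im\phi} & 0 \\ 0 & e^{-im\phi} \end{bmatrix}$   (15-63)

$S^{(m)}(\phi) = \begin{bmatrix} 0 & e^{-im\phi} \\ e^{im\phi} & 0 \end{bmatrix}$   (15-64)

with m = 1. However, when m = 0, 1, 2, $\cdots$ the same matrices also
satisfy the multiplication requirements of the group. They are irreducible
except when m = 0. There, we obtain (see problem at the end of this section)

$C^{(0)}(\phi) = 1,\ S^{(0)}(\phi) = 1; \quad C^{(0')}(\phi) = 1,\ S^{(0')}(\phi) = -1$   (15-65)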

A slightly different procedure is sometimes desirable. We see from
eq. (62) that any improper rotation may always be written as a combination
of a proper rotation and a reflection. The elements of the group
could thus be considered as an infinite number of proper rotations and the
single improper rotation which is represented by $\sigma$. When the latter is
reduced by means of (54) we obtain

$\Gamma(0,-1) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$   (15-64a)

Thus the irreducible representations are those of (63) and the single one of
(64a). When m = 0, we again get (65).


Problem. Show that both (63) and (64) may be reduced to diagonal form with the 


matrix 



15.17. The Dihedral Groups. An important subgroup of $R^{\pm}(2)$ is
obtained by restricting the values of $\phi$. Consider a regular polygon in the
XY-plane with coordinates of the $n$ corners

$$x_k = r\cos(2\pi k/n); \quad y_k = r\sin(2\pi k/n); \quad (k = 0, 1, 2, \cdots, n-1)$$

where $r$ is the radius vector from the origin to the corner. Now if $\phi$ in (56)
takes the value $2\pi/n$, the matrix $A(\phi,d)$ will transform the polygon into
itself by either a proper or an improper rotation. The elements of the
group will be indicated by $C$ in the former case and by $S$ in the latter. The
corresponding matrices are $A(2\pi/n,d)$ with the appropriate choice of $d$,
but we will find it convenient again to use $C$ and $S$ for the matrices,
distinguishing between the abstract element and its representation by means
of different type. The whole group is finite and of order $2n$; it is called the
dihedral group $D_n$. It may be generated by the relations

$$C^n = E; \quad S^2 = E; \quad SC = C^{-1}S \qquad (15\text{-}66)$$

We now see why the group of sec. 15.1 was called $D_3$. If we let $n = 3$ in
eq. (66), we will have the generating relation of eq. (1), provided we
reletter the elements $C$ and $S$ of (66) so that they read $A$ and $C$,
respectively.

Suppose $q$ is an integer; then we may write $n = 2q + 1$ if $n$ is odd or
$n = 2q$ if $n$ is even. There will be $(q + 1)$ classes among the proper
rotations for both $n$ even and $n$ odd. These will correspond to
$C, C^2, \cdots, C^q$, $C^n = E$. For $n$ odd, there will be one additional class,
that of $S$. For $n$ even, there will be two classes involving an improper
rotation. The separation into classes for both cases is illustrated in the
problem in this section.

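If $n$ is odd, the classes for proper rotations will be represented by
$C^{(0)}$ and $C^{(0')}$ of eq. (65) and $q$ matrices of (63) with $m = 1, 2, \cdots, q$. The
class of $S$ is represented by the remaining one-dimensional matrices of (65)
and $q$ matrices like (64) or (64a). If $n$ is even, the situation is similar,
except for the case $m = n/2$ when (63) and (64) become

$$C^{(n/2)}(\phi) = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}; \quad S^{(n/2)}(\phi) = \begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix}$$

This representation may be reduced to give $C^{(q)} = -1$; $S^{(q)} = 1$;
$C^{(q')} = -1$; $S^{(q')} = -1$. Hence, when $n$ is even there are four repre-
sentations of degree one and $(q - 1)$ of degree two. The characters of
dihedral groups are shown in Table 6.


TABLE 6

$n$ odd; $q = (n - 1)/2$; $\phi = 2\pi/n$

$$\begin{array}{c|ccccc}
 & \mathcal{C}(E) & \mathcal{C}(C) & \cdots & \mathcal{C}(C^q) & \mathcal{C}(S) \\ \hline
\Gamma^{(0)} & 1 & 1 & \cdots & 1 & 1 \\
\Gamma^{(0')} & 1 & 1 & \cdots & 1 & -1 \\
\Gamma^{(1)} & 2 & 2\cos\phi & \cdots & 2\cos q\phi & 0 \\
\Gamma^{(2)} & 2 & 2\cos 2\phi & \cdots & 2\cos 2q\phi & 0 \\
\vdots & & & & & \\
\Gamma^{(q)} & 2 & 2\cos q\phi & \cdots & 2\cos q^2\phi & 0
\end{array}$$

$n$ even; $q = n/2$

$$\begin{array}{c|ccccccc}
 & \mathcal{C}(E) & \mathcal{C}(C) & \cdots & \mathcal{C}(C^{q-1}) & \mathcal{C}(C^q) & \mathcal{C}(S) & \mathcal{C}(S') \\ \hline
\Gamma^{(0)} & 1 & 1 & \cdots & 1 & 1 & 1 & 1 \\
\Gamma^{(0')} & 1 & 1 & \cdots & 1 & 1 & -1 & -1 \\
\Gamma^{(q)} & 1 & -1 & \cdots & (-1)^{q-1} & (-1)^q & 1 & -1 \\
\Gamma^{(q')} & 1 & -1 & \cdots & (-1)^{q-1} & (-1)^q & -1 & 1 \\
\Gamma^{(1)} & 2 & 2\cos\phi & \cdots & 2\cos(q-1)\phi & 2\cos q\phi & 0 & 0 \\
\Gamma^{(2)} & 2 & 2\cos 2\phi & \cdots & 2\cos 2(q-1)\phi & 2\cos 2q\phi & 0 & 0 \\
\vdots & & & & & & & \\
\Gamma^{(q-1)} & 2 & 2\cos(q-1)\phi & \cdots & 2\cos(q-1)^2\phi & 2\cos q(q-1)\phi & 0 & 0
\end{array}$$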

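Problem. Show that if $n = 6$, the classes of the group are $\mathcal{C}(E) = E$;
$\mathcal{C}(C) = C, C^5$; $\mathcal{C}(C^2) = C^2, C^4$; $\mathcal{C}(C^3) = C^3$; $\mathcal{C}(S) = S, C^2S, C^4S$;
$\mathcal{C}(S') = CS, C^3S, C^5S$.
If $n = 5$, show that the classes are $\mathcal{C}(E) = E$; $\mathcal{C}(C) = C, C^4$; $\mathcal{C}(C^2) = C^2, C^3$;
$\mathcal{C}(S) = S, CS, C^2S, C^3S, C^4S$.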
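
The class structure asked for in the problem can be checked by brute force from the relations (66) alone. The sketch below is illustrative only: it encodes the element $C^k S^s$ as a pair `(k, s)`, an encoding chosen here for convenience, and the function names are hypothetical.

```python
def mul(x, y, n):
    """(C^a S^s)(C^b S^t) in D_n, using S C^b = C^(-b) S from eq. (66)."""
    a, s = x
    b, t = y
    return ((a + (b if s == 0 else -b)) % n, (s + t) % 2)

def inv(x, n):
    """Inverse element: rotations invert, every C^a S is its own inverse."""
    a, s = x
    return ((-a) % n, 0) if s == 0 else (a, 1)

def classes(n):
    """Conjugacy classes of D_n as sorted lists of (k, s) pairs."""
    elements = [(k, s) for s in (0, 1) for k in range(n)]
    seen, result = set(), []
    for x in elements:
        if x in seen:
            continue
        cls = {mul(mul(inv(g, n), x, n), g, n) for g in elements}
        seen |= cls
        result.append(sorted(cls))
    return result

for n in (5, 6):
    print(n, classes(n))
```

For $n = 6$ the pairs `(1, 0)` and `(5, 0)` share a class, and the twelve elements fall into six classes; for $n = 5$ all five improper rotations form a single class, giving four classes in all.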

15.18. The Crystallographic Point Groups. By considering all operations
which transform certain solid geometric figures into themselves, we
obtain a number of finite subgroups of $R^{\pm}(3)$, called the crystallographic
point groups. They are of considerable importance in the study of crystal
and molecular structure.^22 We assume that one point of the figure is
fixed in space so that if we know the position of two more points which are 
not collinear with the fixed point, the position of the figure is completely 
determined. Under these conditions, the only possible types of motion are 
rotations around an axis passing through the fixed point and reflections in 
a plane containing that point. All other motions may be reduced to a 
combination of these two, for as we have seen in sec. 15.16 any improper 
rotation may always be written as a product of two proper rotations and a 
reflection. When the improper rotation is an inversion (i.e., improper 
rotation by the angle $\pi$) a point will be collinear with its original position
and some fixed point on the axis of rotation, hence an inversion is uniquely 
determined by the position of this fixed point and is independent of the 
position of the axis. The fixed point is called a center of inversion. 

We thus have four fundamental operations: (a) a proper rotation $C_n$
by an angle $\phi = 2\pi/n$ ($n$ is an integer) about an $n$-fold axis of rotation;
(b) reflection in a plane, indicated by $\sigma_h$, $\sigma_d$, $\sigma_v$ (subscripts $h$, $d$ and $v$ refer
to horizontal, diagonal and vertical planes); (c) an improper rotation, $S_n$;
(d) inversion, indicated by $I$.

Selected sets of such operations, together with a unit element which
leaves every point of a figure unchanged, are the elements of the
crystallographic groups. The number of these groups which are of interest in
physical problems is limited by the fact that we need to consider only
those types of symmetry which occur in crystals or molecules. Actually
32 such groups are sufficient to treat all crystals which occur in nature, but a
few more of similar type are needed for certain molecules.^23 Some
isomorphism exists among the groups, so that it is not necessary to construct
32 different character tables.

^22 More details about the crystallographic point groups are given by Schoenflies,
"Theorie der Kristallstruktur," Berlin, 1932. Their application to problems of
molecular structure is discussed by Rosenthal, J., and Murphy, G. M., Rev. Mod. Phys. 8, 317
(1936).

The groups themselves are indicated by the following symbols and
names which will be explained in the next paragraph: (a) cyclic groups,
$C_n$ ($n = 1, 2, 3, 4, 6$); (b) dihedral groups, $D_n$ ($n = 2, 3, 4, 6$); (c) cubic
groups, $T$ and $O$. The remaining groups: $C_{nh}$ ($n = 1, 2, 3, 4, 6$); $C_{nv}$ and
$D_{nh}$ ($n = 2, 3, 4, 6$); $D_{nd}$ ($n = 2, 3$); $C_{ni}$ ($n = 1, 3$); $S_4$, $T_d$, and $O_h$
are either isomorphous with some group in the preceding list or the direct
product of some group there and $I$, the latter being isomorphous with $C_2$.

The system used here for designating the crystallographic groups is due
to Schoenflies. The symbols^24 $C_n$, $S_n$ and $I$ are meant to indicate that the
group contains elements $C_n$, $S_n$ and $I$, respectively. The dihedral groups
$D_n$ have been discussed in sec. 15.17. The cubic groups $T$ and $O$ are the
groups of rotation of the tetrahedron and octahedron. Subscripts $h$, $v$ and
$d$ refer to one or more symmetry planes horizontal, vertical or diagonal with
respect to some axis of symmetry. The presence of a center of symmetry is
shown by the subscript $i$. The simplest way of studying the various
symmetry elements of a given group is by means of the stereographic
projections^25 of the corresponding solid figures.

In the tables at the end of this section we present the characters for
all of these groups. It is convenient to indicate a class by means of
symbols like $C_n$, $S_n$ or $\sigma$, a typical element of it. If a number precedes the
symbol it is the number of elements in that class, otherwise the class in
question contains but one element. Representations of degree one are
indicated by $A$ or $B$; of degree two by $E$ (except for certain cases, where
two one-dimensional representations occur in pairs); of degree three by $T$.
When two one-dimensional representations $A$ and $B$ occur in the same
group, it will be found that the character of $A$ is $+1$ for the class
representing rotation by $2\pi/n$ around the principal $n$-fold axis and $-1$ for $B$. The
principal axis is always taken in the direction of $Z$. Representations of
different symmetry with respect to reflection in a plane perpendicular to the
principal axis are distinguished by $'$ and $''$ while subscripts $g$ and $u$ refer to
positive and negative characters for the class of $I$.

^23 Additional groups needed in the study of molecular structure are $D_{nh}$ and $C_{nv}$,
($n = 5, 7, 8$); see Wilson, E. B., J. Chem. Phys. 2, 432 (1934).

^24 These symbols are explained in the references of footnote 22, or see Sponer, H.,
and Teller, E., Rev. Mod. Phys. 13, 76 (1941).

^25 They are given by Sponer and Teller, loc. cit.





Methods of finding the characters^26 for cyclic and dihedral groups
have already been described in detail. The cubic groups $O$ and $T$ are the
symmetric and alternating groups on four letters; they have been discussed
as examples of permutation groups. The remaining groups, which are
indicated as a direct product, will have twice as many classes and
representations as appear in our tables. Each representation given there will
occur once with the subscript $g$ and once with $u$ (except for $C_{3h}$ where the
representations are $A'$, $A''$, $E'$, $E''$). For example, $C_{3i} = C_3 \times I$ will
have classes $E$, $C_3$, $C_3^2$, $I$, $IC_3$, $IC_3^2$. The classes which are found in
$C_3$ will have the same characters as $C_3$, once as $g$ and once as $u$ while the new
classes will have the same characters as $C_3$ for $g$-representations and the
negative of those for $u$-representations. Groups having the same character
table are isomorphous.

For convenience of reference, we also include the infinite group $D_\infty$,
which is isomorphous with both $R^{\pm}(2)$ and $C_{\infty v}$, and the group
$D_{\infty h} = D_\infty \times I$ which is isomorphous with $R^{\pm}(2)$.

One further question of interest here concerns the transformation
properties of a vector when subjected to the operations of a
crystallographic group. We have shown, in sec. 15.15, how a vector is transformed
by the elements of the group $R^{\pm}(3)$. The representations from which this
effect is immediately seen are given by (43) and (49), the characters of
which are

$$\Sigma_R = 1 + 2\cos\phi; \quad \Sigma_S = -1 + 2\cos\phi$$

The same characters must also apply to the crystallographic groups since
they are subgroups of $R^{\pm}(3)$, but it does not follow that the characters
remain irreducible. As an example, consider the group $C_4$ where all of the
classes involve proper rotations. The angles for the classes of $E$, $C_2$, $C_4$, $C_4^3$
are $0$, $\pi$, $\pi/2$, $3\pi/2$, hence $\Sigma_R = 3, -1, 1, 1$. Comparison with the
character table for $C_4$ shows that these numbers are the sums of the characters for
the representations $A$ and $E$. The reader should draw a figure of the
appropriate symmetry which in this case is a square. Let the $Z$-axis be
perpendicular to the plane of the paper, then it is immediately obvious that
$z$ transforms like $A$, for $z$ is unchanged by the operations of the group.
When the operation $C_2$ is applied to the figure (i.e., rotation by $\pi$) $x$ is
transformed into $-x$ and $y$ into $-y$, hence $x \pm iy$ becomes $-(x \pm iy)$.
Proceeding in this way with the other elements of the group, it will be
seen that $x + iy$ transforms like the first set of characters for $E$ in Table 7
and $x - iy$ like the second set. For $S_4$, the last two classes are improper
rotations with $\Sigma_S = -1, -1$, hence the reducible characters are $3, -1,
-1, -1$ or $B + E$; $z$ transforms like $B$ and $x \pm iy$ like $E$. We have
indicated all of these transformation properties in our tables. When two or
more groups are isomorphous and the representations are the same
(examples, $D_4$, $C_{4v}$ and $D_{2d}$ or $C_4$ and $S_4$), the characters for the coordinates refer
to the first group of that table. To obtain them for the other groups, one
must change the sign of the characters for the improper rotations, for
example, $z$ transforms like $A_2$ for $D_4$, like $A_1$ for $C_{4v}$ and like $B_2$ for $D_{2d}$.

^26 The reducible representations of all of the crystallographic groups are given by
Seitz, Z. Kristallographie, A88, 433 (1934).
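
The reduction of the vector characters for $C_4$ and $S_4$ can be reproduced with the usual character formula $n_i = \frac{1}{g}\sum \chi_i^{*}\chi$. The sketch below is a hedged illustration: the character table written into `C4_TABLE` is the standard one for a cyclic group of order four with classes ordered $E$, $C_2$, $C_4$, $C_4^3$, and it is an assumption here, since Table 7 is not reproduced in this passage. The labels `E(1)` and `E(2)` for the two members of the $E$ pair are ours.

```python
import math

# Assumed standard characters of C4 (also S4), class order E, C2, C4, C4^3.
C4_TABLE = {
    "A": [1, 1, 1, 1],
    "B": [1, 1, -1, -1],
    "E(1)": [1, -1, 1j, -1j],
    "E(2)": [1, -1, -1j, 1j],
}

def vector_character(angle, proper=True):
    """Character of a vector: +1 + 2cos(phi) if proper, -1 + 2cos(phi) if improper."""
    return (1 if proper else -1) + 2 * math.cos(angle)

def decompose(chars, table):
    """Multiplicity of each irreducible representation (all classes have one element)."""
    order = len(chars)
    result = {}
    for name, irrep in table.items():
        n = sum(complex(a).conjugate() * b for a, b in zip(irrep, chars)) / order
        result[name] = round(n.real)
    return result

angles = [0.0, math.pi, math.pi / 2, 3 * math.pi / 2]
c4_chars = [vector_character(a) for a in angles]
# For S4 the last two classes are improper rotations.
s4_chars = [vector_character(a, proper=(i < 2)) for i, a in enumerate(angles)]

print(decompose(c4_chars, C4_TABLE))
print(decompose(s4_chars, C4_TABLE))
```

The first result contains $A$ and both members of $E$, the second contains $B$ and both members of $E$, in agreement with the text.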


[TABLE 7: character tables for the cyclic groups, the dihedral groups and the
cubic groups]

15.19. Applications of Group Theory. Since group theory is concerned
with symmetry properties, its mathematical methods should be useful in
many physical problems. Its most obvious application consists in the
classification of crystals and polyatomic molecules according to a group of
the appropriate symmetry. It is natural to inquire whether group theory
may also be used in quantum mechanics. For systems containing a
number of particles, calculations by the usual methods are difficult; hence it is
fortunate that the symmetry of such problems can be utilized to some
extent in their study.^27

The Schrödinger equation for a system of $n$ identical particles
(electrons, protons, etc.) may be written as follows:

$$H(1, 2, \cdots n)\,\psi(1, 2, \cdots n) = E\,\psi(1, 2, \cdots n) \qquad (15\text{-}68)$$

where the numbers $1, 2, \cdots n$ appearing in both the Hamiltonian operator $H$
and the state function $\psi$ indicate that these quantities depend on the
coordinates of particles $1, 2, \cdots n$. Now it is clear that, if the coordinates
of particles $i$ and $j$ are interchanged everywhere in eq. (68), the latter
remains valid, for such an exchange amounts to no more than a relabelling
of the particles. This fact might be indicated formally by applying to (68)
the operator $P_{ij}$ defined as effecting an interchange of particles $i$ and $j$:

$$P_{ij}H(1, 2, \cdots n)\,P_{ij}\psi(1, 2, \cdots n) = E\,P_{ij}\psi(1, 2, \cdots n)$$

But the functional form of $H$ is unaltered when $P_{ij}$ is applied, regardless
of its specific form, provided the particles are identical, hence this equation
reads

$$H P_{ij}\psi = E P_{ij}\psi$$

In other words, $P_{ij}\psi$ is also an eigenfunction of $H$, and one belonging to the
same eigenvalue $E$.

Now $P_{ij}$ is an element of the symmetric group on $n$ particles.
Therefore the state of affairs described above is usually expressed by saying that
the Schrödinger equation is invariant under the symmetric group.
Permutations are not the only operations under which the wave equation is
invariant. Suppose the nucleus of an atom is considered as a fixed field of force;
then rotations and reflections at this point leave the energy of the system
unchanged (i.e., the operator $H$ is invariant with respect to them). The
groups in question are those of sec. 15.15. If the atom is in a
homogeneous electric or magnetic field, the appropriate group is the subgroup
of rotations about a fixed axis (see sec. 15.16). For a diatomic molecule,
the two nuclei are the centers of force (as a first approximation) and the
groups are those of rotation about, and reflection in a plane through, the
line joining the nuclei. If the nuclei are identical (as in hydrogen,
oxygen or nitrogen) reflections in a plane perpendicular to the internuclear
line (i.e., exchange of the nuclei) must also be included. For a polyatomic
molecule, the potential energy has the same symmetry as the molecule
itself, hence the wave equation is invariant to some one of the
crystallographic groups. These examples should be sufficient to indicate the kinds
of groups which are of importance in quantum mechanical problems. Each
case must be studied individually and all groups under which the particular
Schrödinger equation to be studied is invariant must be taken together to
form the complete group of the Schrödinger equation.

^27 Applications of group theory to quantum mechanics are discussed by Wigner,
Bauer, Eckart, loc. cit. and by van der Waerden, "Gruppentheoretische Methode in der
Quantenmechanik," J. Springer, Berlin, 1932; Weyl, H., "Theory of Groups and
Quantum Mechanics," Methuen, London, 1931.

As a simple example of the method, consider the one-dimensional wave
equation^28

$$\left\{-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x)\right\}\psi(x) = W\psi(x) \qquad (15\text{-}69)$$

where the potential energy is of such a form that

$$V(x) = V(-x)$$

and where the energy state $W$ is non-degenerate, as is nearly always true in
such one-dimensional problems. Suppose $P_I$ is an operation which
replaces $x$ by $-x$ wherever it occurs in (69). Then

$$P_I\psi(x) = \psi'(x) = \psi(-x)$$

the result being a new $\psi$-function, which has the same value at $x$ as the
old one had at $-x$. The new $\psi$-function will, however, satisfy the wave
equation as well as the old one, with the same value of $W$. Hence it must
be a constant multiple of $\psi(x)$, i.e., $\psi' = c\psi$. But if both $\psi$ and $\psi'$ are to be
normalized, $c$ can only be $+1$ or $-1$. This result recalls the well-known
fact that all eigenfunctions of eq. (69) are either even or odd functions of $x$.

To exhibit the connection with group theory we note here the following
facts which will be illuminated subsequently. Let $P_E$ be an operator that
replaces $x$ by itself. Then

$$P_E\psi(x) = \psi(x)$$

Combining $P_E$ with $P_I$ we obtain a group,

$$P_I P_E = P_E P_I = P_I; \quad P_I^2 = P_E$$

which is isomorphous with $C_i$ (sec. 15.18), and others mentioned in
preceding sections. It has two irreducible representations both of degree one
(see Table 7). The representation $A_g$ corresponds to even eigenfunctions
and $A_u$ to odd ones.
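
As a concrete illustration of the even and odd classification, one may take the harmonic oscillator, whose potential satisfies $V(x) = V(-x)$. This choice of example is ours and is not made in the text above. The sketch assumes the familiar unnormalized eigenfunctions $\psi_n(x) = H_n(x)e^{-x^2/2}$ in dimensionless units and determines the constant $c$ in $P_I\psi = c\psi$; the function names and sample points are hypothetical.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) by the three-term recurrence."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def psi(n, x):
    """Unnormalized oscillator eigenfunction (assumed example, see lead-in)."""
    return hermite(n, x) * math.exp(-0.5 * x * x)

def parity_character(n, samples=(0.3, 0.9, 1.7)):
    """Return c = +1 or -1 such that psi_n(-x) = c * psi_n(x) at the sample points."""
    for c in (1, -1):
        if all(math.isclose(psi(n, -x), c * psi(n, x), rel_tol=1e-9, abs_tol=1e-12)
               for x in samples):
            return c
    raise ValueError("eigenfunction is neither even nor odd at the sample points")

for n in range(6):
    label = "A_g" if parity_character(n) == 1 else "A_u"
    print(n, parity_character(n), label)
```

States with even $n$ give $c = +1$ and belong to $A_g$, while odd $n$ give $c = -1$ and belong to $A_u$.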

Next, let us suppose that the Hamiltonian operator is invariant to a
group of linear substitutions, such as $R$, $S$, etc., and that the $\psi$-function
depends on $n$ coordinates $x_1 \cdots x_n$. These may be combined to form a
vector $\mathbf{x}$. If, then,

$$\mathbf{x}' = R\mathbf{x}$$

we may define an operator $P_R$ which changes $\psi(\mathbf{x})$ into $\psi(\mathbf{x}')$:

$$P_R\psi(\mathbf{x}) = \psi(\mathbf{x}')$$

^28 We now use $W$ for the eigenvalue in this section, reserving $E$ for the unit element
of a group.

Now consider two cases:

a. $\psi(\mathbf{x})$ is non-degenerate. Since, from the invariance of $H$, $P_R\psi$
must also satisfy the Schrödinger equation with the same $W$, it must be
identical with $\psi$ (except for a constant multiplier).

b. $\psi(\mathbf{x})$ belongs to an eigenvalue $W$ which possesses an $\alpha$-fold
degeneracy. We may then label the $\alpha$ linearly independent functions

$$\psi_1, \psi_2, \cdots, \psi_\alpha$$

The effect of $P_R$ on $\psi_i$ will then be to convert it into a linear combination of
$\psi_1 \cdots \psi_\alpha$, for such a combination is the most general function belonging to $W$.
Hence

$$P_R\psi_i = \sum_{k=1}^{\alpha} \psi_k D(R)_{ki}$$

$D(R)$ being a certain matrix associated with the operator $P_R$. Similarly,

$$P_S\psi_k = \sum_{j} \psi_j D(S)_{jk}$$

and

$$P_S P_R\psi_i = \sum_{j,k} \psi_j D(S)_{jk} D(R)_{ki} = \sum_{j} \psi_j [D(S)D(R)]_{ji} \qquad (15\text{-}70)$$

3 

From sec. 15.7, it should be clear that the matrices whose elements appear
on the right of eq. (70) are representations of the group of the operators
$P_R$, $P_S$. The dimension of each representation equals the number of
linearly independent $\psi$-functions, hence it is also equal to the degeneracy
of the eigenvalue. If the original set of $\psi$-functions were not linearly
independent the resulting representations would be reducible. When the
complete set of irreducible representations is obtained we see that each one
would correspond to an eigenvalue of the quantum mechanical problem.
The value of group theory in quantum mechanics is thus evident. From
the symmetry of the system and without solving the wave equation we
may obtain the possible number of eigenvalues and the degeneracy of each.
Moreover the eigenvalues may be classified with regard to the particular
representation to which each belongs. This fact is of considerable interest
to the spectroscopist in studying the possible number and the symmetry
of the energy levels to be expected in a given case. For example, as
indicated in an earlier paragraph of this section, the group for the diatomic
molecule is $R^{\pm}(2)$ of sec. 15.16. This has an infinite number of
representations $m = 1, 2, 3, \cdots$ and two representations for $m = 0$. These
correspond to the electronic energy levels^29 $\Pi, \Delta, \Phi, \cdots$ for $m = 1, 2, 3, \cdots$
and $\Sigma^{\pm}$ for $m = 0$.

One further problem of importance is the determination of selection
rules for allowed transitions in atomic and molecular spectroscopy. As
shown in sec. 11.28 these depend on the matrix elements of the electric
moment. The latter is itself a vector and its components will transform
under the operations of the group like $x$, $y$, $z$ or some combination of these
components as shown in Table 7 for the various symmetry groups. The
$\psi$-function of a given state will also belong to some irreducible
representation of the group. The product of a component of the electric moment and
the $\psi$-function will transform like the direct product of the representations
for each. This direct product will often be reducible and when reduction
is effected, the result will be a sum of representations of the symmetry
group. Transitions are allowed only to these states. Actually it is not
necessary to know the representations themselves as a knowledge of the
characters alone is sufficient. The reader should refer to other sources^30
for the details of the theory. A simple example will show how the method
is used. Suppose a given energy level is known to have a $\psi$-function
which transforms like $E_2$ in the group $D_6$. Then for an electric moment
along $z$, the direct product of the characters of $A_2$ and $E_2$ is $2, 2,
-1, -1, 0, 0$, hence the only allowed transition from $E_2$ is to another state
of the same symmetry. If the component of electric moment $(x \pm iy)$ is of
interest, the characters are those of $E_1$ times $E_2$ or $4, -4, 1, -1, 0, 0$ which
is a sum of characters for $B_1$, $B_2$ and $E_1$. Transitions are allowed from $E_2$
to either $B_1$, $B_2$ or $E_1$ but to no others for the $(x \pm iy)$ component of electric
moment.
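
The $D_6$ example can be verified with a few lines of character arithmetic. The sketch below is illustrative: the character table in `D6_TABLE` is the standard one for $D_6$ with classes ordered $E$, $C_2$, $2C_3$, $2C_6$, $3C_2'$, $3C_2''$, which we assume matches the ordering used in Table 7 because it reproduces the character strings quoted in the text. The function names are hypothetical.

```python
# Assumed standard character table of D6; class sizes for E, C2, 2C3, 2C6, 3C2', 3C2''.
D6_CLASS_SIZES = [1, 1, 2, 2, 3, 3]
D6_TABLE = {
    "A1": [1, 1, 1, 1, 1, 1],
    "A2": [1, 1, 1, 1, -1, -1],
    "B1": [1, -1, 1, -1, 1, -1],
    "B2": [1, -1, 1, -1, -1, 1],
    "E1": [2, -2, -1, 1, 0, 0],
    "E2": [2, 2, -1, -1, 0, 0],
}

def direct_product(a, b):
    """Characters of a direct product are products of the characters."""
    return [x * y for x, y in zip(a, b)]

def decompose(chars):
    """Irreducible components (with multiplicity) of a set of real characters."""
    order = sum(D6_CLASS_SIZES)
    found = {}
    for name, irrep in D6_TABLE.items():
        total = sum(g * x * y for g, x, y in zip(D6_CLASS_SIZES, irrep, chars))
        if total:
            found[name] = total // order
    return found

def allowed_final_states(initial, moment):
    """Representations reachable from `initial` via a moment component of symmetry `moment`."""
    product = direct_product(D6_TABLE[moment], D6_TABLE[initial])
    return sorted(decompose(product))

print(allowed_final_states("E2", "A2"))  # moment along z
print(allowed_final_states("E2", "E1"))  # moment component x +/- iy
```

The first call yields only $E_2$ and the second yields $B_1$, $B_2$ and $E_1$, matching the selection rules stated above.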

For a complete treatment of applications of group theory to quantum
mechanics the references on page 560 should be consulted. Wigner and
Eckart have been concerned mostly with atomic problems. The diatomic
molecule is discussed by van der Waerden; the polyatomic molecule by
Casimir^31 and by Rosenthal and Murphy. Recently, group theory has
been applied to nuclear problems.^32

^29 These are the customary symbols in molecular spectroscopy; see, for example,
Herzberg, G., "Molecular Spectra and Molecular Structure; Diatomic Molecules,"
Prentice-Hall, Inc., New York, 1939.

^30 See Wigner, loc. cit. or Rosenthal and Murphy, loc. cit.

^31 Casimir, H. B. G., "Rotation of a Rigid Body in Quantum Mechanics," J. B.
Wolters, Groningen, 1941.

^32 Wigner, E., Phys. Rev. 51, 106 (1937); Wheeler, J. A., ibid., 52, 1083 (1937).

Abelian group, 527
Abel’s integral equation, 523 
Abraham, 160
Absolute temperature, 30 
velocity, 275 

Accessory conditions, 204 
Adams, E. P., 172
Adams, N. I., Jr., 135
Adiabatic expansion, 36 
Addition of matrices, 292 
of vectors, 135 

theorem for Legendre polynomials, 
105 

Adjacent path, 193 
Adjoint matrix, 295 
Aggregate, probability, 419 
Aitken, 301, 307 
Algebraic calculations, 474 
Alternating group, 538, 541 
Amplitudes, probability, 331 
Ames, 268 

Analogies between thermodynamic and 
statistical quantities, 443 
Analogues, statistical, of thermodynamic quantities, 436
Analysis, indirect chemical, 300 
Anchor rings, 185 
Angles, Eulerian, 268, 272 
Angular momentum, 271 
eigenfunctions of, 344 
eigenvalues of, 344 
in quantum mechanics, 322, 324 
internal, 279 
velocity, 140, 271 

Antisymmetric eigenfunctions, 400, 439 
Approximate quadrature, 457 
Approximation in the mean of func- 
tions, 262, 265 

Approximation method, for algebraic 
equations, 476 

for differential equations, 467 
for secular determinants, 486 
Arbitrary constants, 33 


Areas, vector, 139 
Arrangements, 415, 416 
Arrays of numbers, 287 
Assemblage of identical particles, 401 
Assignment of statistical weights, 440 
Associate matrix, 295 
Associated Laguerre function, 80 
Laguerre polynomial, 78, 102, 128 
Legendre functions, 68 
differential equation for, 68 
representation, 540 
spherical harmonics, 68, 69 
differential equation for, 68, 69,218 
tensor, 162 
Associative law, 526
Atmospheric pressure, 36 
Atom in a magnetic field, 392 
Auxiliary equation, 49 
Average error, 493 
weighted, 497 
Axes, coordinate, 167 
Axial vector, 160 

Axiomatic foundation of quantum me- 
chanics, 319 
Axis of rotation, 271 

of symmetry, n-fold, 555 
Azimuth, 172 

Bacteria, 33 

Baggoit, 466, 474 
Baker, 474 
Balls in boxes, 422 
Barrier problems, 337 
Base vectors, 188 
Bauer, 413, 533, 546, 560 
Beattie, 460 
Becker, R., 160
Bernoulli’s equation, 44 
numbers, 457 
B-function, 93 
Bessel coefficients, 109 
functions, 75, 109, 128 
formulas involving, 117
Bessel's differential equation, 74, 216,
230, 522 
integral, 112 

interpolation formula, 453 
Bilinear form, 302 
Binomial coefficients, 417 
addition theorem of, 418 
theorems on, 418, 419 
Binomial expansion, 417 
theorem, 109 
Biot, 246

Bipolar coordinates, 182 
Birge, 475, 487, 502 
Bliss, 193, 210
Bloch’s theorem, 81 
Bochery 72, 299, 309 
Body, deformable, 164 
elastic, 164 
rigid, 268 

Bohr’s frequency condition, 386 
radius, 127 

theory, angular momentum, 324 
Bolza, 193, 210
Born, 21, 87, 313, 413
Boundary conditions of differential 
equations, 255, 503, 521 
Brachistochrone, 197 
Bridgman, 15, 17
Brillouin, 413
Browne, W. R., 11
Brunt, 501
Burington, 59, 151 
Burnside, 526 
Byerly, 210, 243 

Cable, hanging under own weight, 58 
Calculation, algebraic, 474 
Campbell, 246 

Canonical ensemble, 430, 432 
equations, Hamilton’s, 270, 427 
Canonically conjugate variables, 270 
Capacitance, 44 
Carathéodory, 13, 27
principle of, 26 

Cartesian, coordinates, 167, 172, 322 
expansion for divergence, 145 
for scalar product, 137 
for scalar triple product, 142 
for vector product, 138

Casimir, 563 
Catenary, 58 


Cauchy relations, 41, 88 
Cayley-Klein parameters, 273 
Central difference formulas, 453 
Central field motion, quantum treat- 
ment, 347 
Centrifuge, 38 
Chain, sliding over peg, 51 
Chapman, 415 

Characters of a representation, 534, 556 
tables of, 557 
Characteristic equation, 7 
of a matrix, 303 
functions, 240 
roots of a matrix, 304 
values, 240 

Charged cylinder, potential near, 220 
Chemical analysis, indirect, 300 
Chemical potentials, 25 
Christoffel three-index symbol, 162, 101 
Circular membrane, vibrations of, 248 
Clairaut’s equation, 47, 48
Clapeyron’s equation, 38 
Classes, 416, 528 

Classical physics contrasted with quan- 
tum mechanics, 317 
Classification of eigenvalues according 
to irreducible representations, 
562 

Clausius, 11, 24, 38
Closed systems, 378 
Coefficient of diffusion, 233 
Coefficients, detached, 481 
Cofactor of determinant, 289 
Cogredient variable, 303 
Colatitude, 172 

Collar, 268, 291, 481, 482, 485, 486 
Collineatory transformation, 302
Combinations, 415 
of three vectors, 141 
with repetitions, 417 
Commutability of operators, 321, 332 
Compatible measurements, 332 
Complementary function, 53 
Complete differential, 82 
solution, 32 

Completeness of a set of functions, 242, 
243 

eigenfunctions, 328 
Complex integration, 88 
of a group, 529 
Components, of a tensor, 157
Components, of thermodynamic system, 26

of a vector, 132 
of a curvilinear vector 169 
Composite functions of Del, 148 
Conditions on state functions, 320 
Condon, 342, 406, 413, 414 
Conducting sphere, in field of point 
charge, 221 
potential near, 219 
Conductivity, thermal, 35 
Configuration space, 320 
Confocal ellipsoidal coordinates, 173 
paraboloidal coordinates, 179 
Congruent transformation, 302, 307 
Conical coordinates, 178 
Conjugate variables, canonically, 270 
elements of a group, 528 
subgroup, 529 

Conjunctive transformation, 302 
Conservation of density-in-phase, 428 
Conservative system, 200, 267 
Constraints, 269 

Continuity, equation of, 147, 212, 383 
Continuous group, 542 
Continuous spectrum, 386 
of eigenvalues, 320, 324 
Contraction of tensors, 161 
Contragredient variable, 302 
Contravariant tensor, 158 
vector, 157 
Coolidge, 408, 486 
Coordinate axes, 167 
line, 167, 187 
surface, 167 

Coordinate systems, orthogonal, 168 
non-orthogonal, 187 
n-dimensional, 298 
Coordinates, bipolar, 182 
Cartesian, 167, 172 
confocal ellipsoidal, 173 
confocal paraboloidal, 179 
conical, 178 
curvilinear, 167 
cylindrical, 173, 186 
ellipsoidal, 174 
elliptic cylindrical, 177 
generalized, 427 
normal, 278, 311 
oblate spheroidal, 177 
of Lagrange, generalized, 269 


Coordinates, parabolic, 180 
parabolic cylindrical, 181 
prolate spheroidal, 175 
relative, 395 
spherical polar, 172, 186 
tensors in curvilinear, 187 
toroidal, 185 
Coset, 529 

Cosines, direction, 133, 168 
Cotes, formula of Newton and, 469 
Coulomb field, motion in, 349 
Courant, 211, 247, 253, 256, 265, 267, 
363, 504, 510 

Covariant derivative of tensor, 164, 
191 

Covariant tensor, 158 
vector, 158 
Cowling, 415 
Cramer's rule, 299
Crawford, 486 
Cross, 486 
Cross product, 138 
Crumpler, 450, 487 
Crystallographic point groups, 554 
Cubic groups, 555 
Curl in curvilinear coordinates, 170 
in tensor notation, 192 
of a vector, 147 

Current in quantum mechanics, 383 
Curve fitting, errors in, 501 
Curvilinear component of vector, 169 
Curvilinear coordinates, 167 
tensors in, 187 
Cycle, thermodynamic, 8 
of permutation, 538 
Cyclic group, 527, 537 
Cycloid, 198 

Cylinder, potential of charged, 220 
Cylindrical coordinates, 173, 186 
elliptic, 177 
parabolic, 181 

Damped harmonic motion, 51 
Darling, 285 
Darwin, 436, 443 
Darwin-Fowler method, 436 
De Broglie, 413 
formula, 382 
wave length, 336 
Definite form, 302 
Deformable body, 164
Degeneracy, 282
factor, 442 

due to particle exchange, 399 
Degenerate eigenvalues, 259 
Degree, of a cycle, 438 
of a differential equation, 32 
of a group representation, 531 
Degrees of freedom, 26, 268 
Del, 145 

composite functions of, 148 
successive applications of, 148 
Delta, Kronecker, 158, 294 
De Moivre's theorem, 537
Dennison, 282, 285, 479
Density, flux, 147 
Density-in-phase, 428 
Derivative, covariant, 191 
directional, 145, 169 
of tensor, 162 
of tensor, covariant, 164 
partial, 2 

thermodynamic, 15 
Deviation, standard, 494 
Descents, method of steepest, 443 
Desch, 26 

Determinant, cofactor of, 289 
complementary minor, 289 
definition of, 288 
differentiation of, 291 
expansion of secular, 483 
Gram, 130, 297 
Laplace development of, 289 
multiplication of, 290 
numerical evaluation of, 482 
numerical solution of secular, 483 
properties of, 289 
roots of a secular, 483 
solution of simultaneous equations 
by, 480 
value of, 288 
Wronskian, 130 
Diagonal matrix, 284 
Diagonalization of a matrix, 304, 316 
Diatomic molecule, 344, 560 
Difference, definition of, 451 
divided, 453 
formulas, central, 453 
of tensors, 159 
of vectors, 136 
table, 451 

Differential, complete, 82 


Differential, exact, 1, 8, 82, 151 
higher order, 5 
incomplete, 82 
inexact, 82 

Differential and integral equations, re- 
lation of, 514 

Differential equation, of Thomas-Fer- 
mi, 474 

numerical solution of, 465 
partial, 211 

Differential operator, 253 
in tensor notation, 190 
Differentiation, by polynomial method, 
456 

numerical, 455 
of determinants, 291 
of tensor, 162 
of vectors, 143 
partial, 1 

with interpolation formula, 455 
Diffraction of waves, 229 
Diffusion, differential equation of, 232 
quantum mechanical, 381 
Dimension of a group representation, 
531 

Dingle, 319

Dipole, potential due to, 223 
Dipole moment, 56, 97 
waves, 231 
Dirac, 325, 413 
Dirac 5-function, 234, 325 
Direction cosines, 133, 168 
Directional derivative, 145, 169 
Direct product of groups, 536
sum of representations, 532 
Dirichlet integral, 247 
Discontinuous potentials, 338
Discrete group, 542 
Discriminants of a quadratic form, 308
Dispersion, of a function, 420, 421 
of a statistical aggregate, 333 
Displacement, electric, 147 
operators for spins, 391 
Distribution law, Gauss, 487 
quantum mechanical, 437 
Distribution of probability, 420 
Divergence, 146 
in tensor notation, 191 
theorem of, 154 
Divided differences, 453 
Divisor, normal, 529 
Dot product, 136 
Double volume integrals, 366 
two-center problem, 408 
Dummy index, 157 
Duncan, 268, 291, 481, 482, 486 
Dushman, 413 
Dyadic, 158 
Dynamics, 166 

Eckart, 277, 342, 397, 545, 546, 560 
Eddington, 166 
Eigenfunction, 240 
completeness of, 262 
of integral equations, 509 
Eigenstates, simultaneous, 332 
Eigenvalue, 240 
degenerate, 259 
of a kernel, 509 
of a matrix, 304 
Eigenvectors, 304, 389 
Einstein-Bose, statistics, 440 
Eisenhart, 171 
Elastic body, 164 
Electric displacement, 147 
flux, 146 
polarization, 156 
Electricity, 166, 185 
Electrodynamics, 175 
Electromotive force, 42 
Electron, 402 
spin of, 386 

Element, conjugate, 528 
in probability theory, 419 
line, 168, 187 
of a group, 526 
surface, 168, 187 
volume, 168 
Ellipsoid, ovary, 176 
planetary, 177 
Ellipsoidal coordinates, 173 
Elliptic, cylindrical coordinates, 177 
function, 59, 175 
integral, 175 
Emde, 94, 116, 248 
Emission, radioactive, 423 
Empirical formula, 499 
error in, 501 

Energy, internal, 11, 435 
kinetic, 269 
potential, 171 
shell ensemble, 430 


Ensemble, 428 
canonical, 430 
Gibbsian, 426 
microcanonical, 430 
Enthalpy, 13 
Entropy, 12, 435, 448 
Envelope, 48 
Epstein, 1, 181 
Equation, homogeneous, 299 
inhomogeneous, 299 
integral, 503 
linear, 299 

of a matrix, characteristic, 304 
of continuity, 147 
of state, 7 

of Sturm-Liouville, 516 
solution of integral, 504 
Equations, numerical solution of, differential, 465, 472, 473 
simultaneous, 476, 480 
transcendental, 474 
Equations of motion, Hamilton’s, 270 
Lagrange’s, 269 
Newton’s, 268 

Equilibrium, heterogeneous, 26 
thermodynamic, 14 
Equivalence of linear operators and 
matrices, 358 
Equivalent matrices, 301 
Ergodic hypothesis, 427 
Error, average, 493 
determinate, 487 
function, 488 
in empirical formulas, 501 
of a function, probable, 498 
probability of, 488 
probable, 493 
random, 487 
root mean square, 493 

Essential observability, criterion of, 
318 

Euler, angles, 268, 272, 352, 546 
definition of gamma function, 90 
equation, 195, 256 
integral, 93 

Maclaurin formula, 457 
Mascheroni constant, 92, 411 
method for differential equations, 468 

Even and odd functions, 99, 561 
Exact differential, 1, 8, 27, 151 





Exact differential, equation, 41 
Exchange, degeneracy, 399 
integral, 371, 412 
forces, 371 

Exclusion principle, 395, 399 
Expansion, adiabatic, 36 
coefficients, 358 

Expected mean, in Gibbs phase space, 
429 

of a sequence of observation, 326 
Explicit function, 2 
Exponential integral, 411 
Extensive variable, 1 
Extremal, 195 

Factor group, 530 
Fermi, 384, 474 
Fermi-Dirac statistics, 439 ff. 

Field, scalar, 144 
vector, 144 
Field strength, 156 
Figures, significant, 450 
Findlay, 26 

First order, perturbation, 373 
simultaneous differential equations, 
numerical solution of, 472 
Flow of fluid, 146 
heat, 34, 155 
particles, 383 
Floquet’s theorem, 80 
Flügge, 414 
Fluid, flow, 146 
incompressible, 219 
Flux, density, 147 
electrical, 146 
thermal, 146 

Forbidden transition, 386 
Force, 137 

acting on particle, 268 
moment of, 140 
Form, bilinear, 302 
discriminants of a quadratic, 308 
Hermitian, 314 
positive definite, 302 
quadratic, 302 
semi-definite, 302 

Formula, Bessel's interpolation, 453 
central difference, 453 
empirical, 499 

general remarks on quadrature, 464 
interpolation, 451, 452 


Formula, Lagrange’s interpolation, 
453 

Stirling's interpolation, 453 
Forsyth, 67, 75 
Foster, 246 
Fourier analysis, 241 

of odd and even functions, 245 
Fourier-Bessel, expansion, 251 
integral, 250 
transformation, 229, 235 
transforms, 248, 250 
Fourier integral, 246 
series, 107 
theorem, 247 

Fourier transformation, 234 
Fourier transforms, 246 
Fowler, 436, 443, 449 
Frank, 211, 253 

Frazer, 268, 291, 481, 482, 485, 486 
Fredholm's integral equation, 503 
Free energy, Gibbs, 14 
Helmholtz, 13, 435, 448 
Free particle, 268, 380 
Freedom, degrees of, 268 
Frenkel, 413 

Frequency condition, Bohr's, 386 
Frequency, relative, 419 
Fuchs' theorem, 60 
Full linear group (FLG), 542 
Function, elliptic, 175 
even, 99, 561 
implicit, 6 
odd, 561 
potential, 269 
probable error of, 498 
scalar point, 144, 169 
Functional determinants, 17 

Gamma function, 75, 89 
space, 427 
Gas space, 427 

Gauss, differential equation, 72 
error function, 381, 421, 423, 487 
method, for numerical integration, 
462 

theorem, 154 

Generalized coordinates, 269, 427 
momentum, 269 
General solution, 32 
Generating functions, 128 
Geodesics, 195 
Gibbs, 1, 13, 24, 136, 158, 188, 426, 427 
ensembles, 426 
Goranson, 17 
Goudsmit, 387 
Gradient, 145 

in tensor notation, 190, 191 
Gram determinant, 130, 297 
Gray, 116 

Graeffe’s method, for roots of a poly- 
nomial, 477 

solution of secular determinants, 483 
Green's formula, 518 
Green's function, 516 
examples of, 521 
modified, 519 

Green's theorems, 156, 237 
Gregory's formula, 459 
Group, Abelian, 527 
alternating, 538, 541 
continuous, 542 
crystallographic, 554 
cubic, 555 
cyclic, 527, 537 
definition of, 526 
dihedral, 552, 553 
discrete, 542 
factor, 530 
full linear, 542 
full, real orthogonal, 549 
mixed continuous, 542 
octahedral, 555 
point, 554 
quotient, 530 
rotary reflection, 550 
rotation, 545 

special unitary (SUG), 543, 547 
symmetric, 538, 541 
tetrahedral, 555 
unimodular unitary, 543 
unitary, 542 
velocity, 382 

Group theory, applications of, 559 
Group character, tables of, 557 
Growth, organic, 33 
Gurney, 414 

Hamilton’s canonical equations, 270, 
427 

principle, 199 
Hamiltonian function, 270 
operator, 208, 325 
Hamiltonian function, operator, time 
dependent, 380 
quantum mechanical, 285 
Hankel function, 76, 114 
Harmonic function, 215 
motion, 50 

Heat capacity, 11, 12 
content, 13 

conduction, differential equation of, 
232 

flow, 34, 155 
linear, 233 

two and three dimensions, 235 
Heine's formula, 105, 106 
Heisenberg, 404, 413 
matrix theory, 355 
uncertainty principle, 332 
Heitler-London method, 408 
Helium atom, normal state of, 364 
excited states, 402 
ionized, 349 

Helmholtz' equation, 39, 42 
free energy, 435 
function, 69 
Hermitian form, 314 
matrix, 296, 314 
operator, 255, 328, 358 
scalar product, 314 
vector space, 313 

Hermite differential equation, 76, 117 
functions, 117, 120, 343 
polynomials, 76, 117 
Herzberg, 563 

Heterogeneous equilibrium, 26 
Hicks, 483 

High eigenvalues, distribution of, 260 
Hilbert-Schmidt method for integral 
equations, 510 

Hilbert, 211, 253, 256, 265, 267, 363, 
504, 510 

Hobson, 167, 186, 463 
Homogeneous, meaning of term, 45 
Homogeneous equations, 299 
gas reactions, 36 
integral equation, 509 
polynomial, 302 
Homo-polar binding, 408 
Horner's method for roots of a polyno- 
mial, 477 
Howard, 285 
Hughes, 397 
Hydrodynamics, 166, 175, 185 
Hydrogen atom, 125, 127 
quantum mechanical treatment, 347 
Hydrogen molecular ion, 369 
Hydrogen molecule, 408 
Hypergeometric equation, 72, 354 
series, 72 

Hyperbolic functions, 182 

Ideal gas, 24 
ensemble for, 426 
Image, electrical, 223 
Implicit function, 6 
Improper rotation, 310, 549 
Independent systems, quantum me- 
chanics of, 398 
Index, dummy, 157 
of subgroup, 529 
precision, 425, 488 
umbral, 157 
Indicial equation, 61 
Indistinguishable objects, arrange- 
ments of, 416 
Inductance, 42 
Inertia, moment of, 272 
product of, 272 

Inhomogeneous differential equation, 
237 

equations, 299 
integral equation, 508 
Inner product of vectors, 136 
tensors, 161 
Integral, elliptic, 175 
line, 150 
surface, 151 
volume, 151 

Integral and differential equations, re- 
lation of, 514 

Integral equation, Abel's, 523 
definition of, 503 
eigenfunctions of, 509 
Fredholm's, 503 
homogeneous, 509 
inhomogeneous, 508 
kernel of, 503 
linear, 503 
non-linear, 503 
of the third kind, 503 
resolvent of, 505 
Schmidt-Hilbert method for, 510 
solution of, 504 


Integral equation, summary of methods 
of solution, 514 
use of, 514 
Volterra's, 503 

Integrating denominator, 28, 29, 84 
factor, 41 

Integration, numerical, 456 
vector, 149 
Intensive variable, 1 
Internal energy, 11, 435 
Interpolation, inverse, 454 
two-way, 454 

Interpolation for equal values of the 
argument, 450 

unequal values of the argument, 453 
Interpolation formula, 451 
Bessel's, 453 
differentiation, 455 
Lagrange’s, 453 
Newton's, 452 
Stirling's, 453 

Invariance of wave equation, 560 
Invariant, 158 
subgroup, 529 

Inverse of a group element, 526 
interpolation, 454 
Inversion, 549 

Irreducible representations and eigen- 
values, 562 
Isomorphism, 530 
Isoperimetric problems, 203 
Isothermal process, 12 
Isotope effect, 397 
Iterated kernels, 505 
Iteration method for algebraic equa- 
tions, 476 

differential equations, 467 
solution of secular determinant, 486 

Jacobians, 17, 18 
Jacobi polynomial, 74, 354 
elliptic functions, 175 
Jahnke, 94, 116, 248 
James, 408, 486 
Jeans, 185 
Jordan, 313, 413 
Joule-Thomson coefficient, 23 

Kellogg, 175 
Kemble, 265, 413 
Kepler's law, 201 
Kernel, eigenvalues of, 509 
infinite, 507 
iterated, 505 

of an integral equation, 503 
symmetric, 510 
Kinetic energy, 269 
Klein and Cayley parameters, 546 

Kneser, 210 

König, 450, 481 
Kowalewski, 288, 508 
Krebs, 414 
Kron, 166 

Kronecker delta, 101, 120, 158, 294, 
325 

Kronig, 414 

Kutta-Runge method for differential 
equations, 469 

Lagrange’s equations of motion, 269 
generalized coordinates, 269 
interpolation formula, 453 
method of undetermined multipliers, 
205 

Lagrangian equations, 201 
function, 200, 269 
Laguerre differential equation, 77 
function, 122 
associated, 78, 125, 350 
polynomial, 77, 122, 128 
associated, 78, 124, 128 
Landé, 27 

Language, classical, 319 
Laplace’s equation, 156, 203, 232 
applications, 212, 219, 221 
in 2 dimensions, 213 
in 3 dimensions, 215 
Laplacian, 148 

in cylindrical coordinates, 186 
in spherical polar coordinates, 186 
in tensor notation, 192 
Latent heat of change of pressure, 11 
Law, Gauss distribution, 487 
Newton’s, 186, 268 
Least action, principle of, 210 
Least squares, principle of, 489 
Legendre coefficient, 66 
differential equation, 61, 94, 522 
functions, associated, 229 
polynomials, 66, 94, 100, 101, 128, 
222, 256 

polynomial, roots of, 463 


Leibnitz, 197 
Lense, 211 
Lerman, 17 
Levy, 466, 474 

Lindsay, 202, 259, 415, 423, 427 
Line, coordinate, 167, 187 
element, 168, 187 
of force, 45 
integral, 1, 8, 150 
Linear dependence, 128 
equation, 299 

equations, numerical solution of si- 
multaneous, 480 
independence of vectors, 297 
integral equation, 503 
momentum in quantum mechanics, 
322 

substitution operators, 390 
transformation, 300 
variation functions, method of, 367 
vector space, 296 
velocity, 140, 271 
Liouville-Neumann series, 504 
Sturm equation, 516 
theorem, 428 
Lithium 

doubly ionized, 349 
Littlewood, 534 
London, 408, 413 
Longitude, 172 
Lovitt, 504, 505, 519 

Macdougall, 1 

Maclaurin expansion, 95, 441 
Maclaurin, formula of Euler and, 457 

Macmillan, 175, 268 

Magnetic field, 392 

Magnetic moment of electron, 387 

Many-body problem, 395 

Margenau, 115, 202, 259 

Mason, 175 

Massey, 414 

Mathews, 116 

Mathieu’s differential equation, 78 
Matrices, addition of, 292 
conformable, 292 
direct product of, 293 
equivalent, 301 
multiplication of, 292 
subtraction of, 292 
Matrix, associate, 295 
adjoint, 295 
characteristic, 303 
characteristic roots of, 304 
definition of, 291 
diagonalization of, 304, 316 
diagonal, 294, 355, 375 
eigenvalues of, 304 
eigenvectors of, 304 
Hermitian, 296, 314, 355 
mechanics, 355 

method of solution for secular deter- 
minants, 485 
non-singular, 542 
null, 293 
orthogonal, 296 
partition of, 293 
rank of, 292 
reciprocal, 295 
rectangular, 292 
singular, 291 
symmetric, 296 

symmetric and skew symmetric, 
296 

trace of, 294 
transform of, 302 
transposed, 294 
unit, 294 
unitary, 296, 315 
Mayer, J. E., 415, 449, 461 
Mayer, M. G., 415, 449, 461 
Maxima in a tabulated function, 455 
Maximum area, 205 
volume, 206 

Maxwell-Boltzmann distribution law, 
433, 440 

Maxwell, 15, 180, 185 
Maxwell's relations, 15 
McConnell, 166 
Mean in phase space, 429 
of a function, 420 
of aggregate of measurements, 330 
Measure of precision, 425 
Measurements, rejection of, 499 
weight of, 497 
Mechanical work, 137 
Mechanics, 175, 268 
statistical, 415 
Mellor, 477 

Membrane, vibrating circular, 248 
Method, Gauss', 462 


Method of averages for empirical for- 
mulas, 499 
least squares, 500 

iteration for algebraic equations, 476 
Newton-Raphson for algebraic equations, 475 

regula falsi for algebraic equations, 474 

Microcanonical ensemble, 430 
Microscopic state, 437 
Milne-Thomson, 175, 185 
Milne's method for differential equations, 472 

Minima in a tabulated function, 455 
Minimum value of integral, 193 
surface of revolution, 198 
Minor of determinant, 289 
Mixed-continuous group, 542 
Mixed tensor, 158 
Möbius strip, 151 
Molecular spectroscopy, 563 
Molecule, diatomic, 560 
motion of, 276 
polyatomic, 268 
potential energy of a, 281 
quantum mechanical Hamiltonian of 
a, 285 

rotational motion of a, 278 
space, 427 

translational motion of a, 277 
vibrational energy of a, 280 
vibrational frequency of a, 282 
vibrational motion of a, 277 
Moment of a force, 140 
of aggregates of measurements, 330 
of a probability distribution, 421 
of inertia, 272 
of momentum, 271 
Moment theorem, 421 
Momentum, angular, 271 
generalized, 269 
moment of, 271 

Motion, Hamilton's equations of, 270 
Lagrange's equations of, 269 
Newton's laws of, 186, 268 
of a molecule, 276 
Morse, 250, 413 
Mott, 414 
Muir, 288 

Multiplication of determinants, 290 
matrices, 292 
Multiplication of determinants, tensors, 160 

Multiplication table of a group, 527 
Murnaghan, 268, 531, 532, 535, 544 
Murphy, 554, 563 
Mu-space, 427 

Nebengruppe, 529 
Negative kinetic energy, 342 
Neumann function, 76 
Liouville series, 504 
Neutrons, 402 

Newton’s binomial expansion, 417 
equations of motion, 186, 268 
interpolation formula, 452 
probability distribution, 422 
Newton-Cotes formula, 459 
Newton-Raphson, method for alge- 
braic equations, 475 
roots of a polynomial, 477 
Nielsen, H. H., 286 
Nielsen, N., 116 

Non-orthogonal coordinate systems, 
187 

Non-singular matrices, 542 
Normal coordinates, 278, 311 
divisor, 529 
mode of vibration, 282 
Normalization of functions, 243 
Nucleus, atomic, 268 
of an integral equation, 503 
Numbers, Bernoulli, 457 
Numerical determination of roots of 
polynomial, 477 
differentiation, 455 
evaluation of determinants, 482 
integration, 456 

solution of differential equations, 465 
secular determinants, 483 
simultaneous equations, 476-480 
transcendental equations, 474 

Oblate spheroidal coordinates, 177 
Observability, essential, 318 
Observable, 319, 322 
Occupation numbers, 437 
Odd function, 99 
Operand, 321 

Operations composing crystallographic 
groups, 554 
Operator, 48, 320, 322 


Operator, commuting, 332 
equation, 321 
Hermitian, 255, 358 
in tensor notation, 190 
vector, 169 
Orbital, 406 

Order of a differential equation, 32 
group, 527 
group element, 527 
Ordinary differential equations, 32 
Orthogonal coordinate system, 168 
matrix, 296 

transformation, 302, 309 
Orthogonality of functions, 242 
quantum mechanical eigenfunctions, 
328 

Orthogonalization of vectors, 298 
Orthohelium, 407 
Orthonormal functions, 243 
Oscillation, forced, 54 
natural, 52 

Oscillator, anharmonic, 59 
by matric mechanics, 355 
harmonic, 121, 202, 286 
quantum mechanical treatment, 342 
Outer product of vectors, 138 
tensors, 160 

Page, 56, 135, 205 
Parabolic coordinates, 180, 181 
Paraboloidal coordinates, 179 
Parameters, Cayley-Klein, 273 
Rodrigues, 273 
Parhelium, 407 
Partial differentiation, 1 
differential equation, 211 
Particle, concept of, 318 
free, 268 
vs. wave, 319 

Particles, system of n free, 268 
restricted, 268 
Particular integral, 53 
solution, 33 

Partition function, 436, 449 
of a permutation, 539 
Pauli, 401 
principle, 395, 399 
spin theory, 386 

Pauling, 172, 351, 366, 371, 413, 414 
Peirce, 59 

Periodicity as boundary condition, 259 
Permutations, 415, 530 
even, 401 
odd, 401 

Perturbation theory, 371 
Pfaff differential equation, 28, 82 
Phase, 25 
integral, 436 
rule, 25 
space, 426 
velocity, 223, 382 
Phillips, 136 
Photon, 402 
Physical system, 319 
Picard method for differential equations, 467 

Plane, potential due to charged, 220 
Planck’s constant, 322 
Plummer, 490, 496 
Point function, 8 
scalar, 144, 169 
Poisson’s equation, 232, 237 
formula, 425 

Polar coordinates, spherical, 172, 186 
vector, 160 

Polarization, electric, 56, 156 
Polarizability, atomic, 376 
Polyatomic molecule, 268 
Polygon, rotation of, 552 
Polynomial, complex roots of, 478 
homogeneous, 302 
method, differentiation by, 456 
for solution of secular determinants, 
483 

numerical determination of roots, 
477 

roots of the Legendre, 463 
Postulates of quantum mechanics, 321 
Potential, chemical, 25 
electrostatic, 212, 219 
energy, 171, 269, 281 
theory, 175, 186 
thermodynamic, 14 
velocity, 212, 219 

Potential due to conducting sphere, 219 
charged cylinder, 220 
charged plane, 220 
Precision index, 425, 488 
measures of, 425, 493, 496 
Principal axis transformation, 311 
Principle of least squares, 489 
Probability, 419 


Probability, aggregate, 419 
amplitude, 331, 386 
density, 420 
of phase, 428 
theory, 419 

Probability distributions, 420 
arithmetical, 420 
continuous, 420 
discrete, 420 
geometrical, 420 
Probable error, 493 
of a function, 498 
Prolate spheroidal coordinates, 175 
Product, Hermitian scalar, 314 
of inertia, 272 
Product of tensors, 
inner, 161 
outer, 160 
Product of vectors, 
cross, 138 
dot, 136 
inner, 136 
outer, 138 
scalar, 136, 297 
scalar triple, 141 
skew, 138 
three vectors, 141 
triple vector, 142 
vector, 137 
Projectile, 40 
Proper rotation, 310 
Property in probability theory, 419 
Protons, 402 

Quadratic form, 302 
discriminants of, 308 
Quadrature, approximate, 457 
formulas, general remarks, 464 
Quadrupole, potential due to, 223 
Quantum, dynamics, 321 

mechanics, general discussion, 317 
number, total, 350 
statics, 321 
Quotient group, 530 

Radiation theory, 384 
Radioactive decay, 33, 43 
emission, 423 
Radius vector, 148 
Random walk, 423, 426 
Randomness, criterion of, 419 
Rank of tensor, 157 
Raphson-Newton method for algebraic 
equations, 475 
roots of a polynomial, 477 
Rate of solution, 35 
Rayleigh-Schrodinger perturbation 
formula, 373 

Reaction, bimolecular, 36 
consecutive, 40 
homogeneous, 36 
opposing, 40 
order of, 37 
rate, 37 

termolecular, 36, 40 
unimolecular, 36 
Real eigenvalues, 329 
Reciprocal matrix, 295 
parallelepiped, 336 
vectors, 188 
Recurrence formula, 72 
Reduced mass, 397 
Reducible representation, 531 
Reduction of group representations, 
533 

Reed, 24, 466 

Reflection coefficient of potential barrier, 340 
rotary, 310 

Regular points of a differential equa- 
tion, 71 

Relative coordinates, 395 
frequencies of measured values, 330 
frequency, 419 
velocity, 275 

Relativity, theory of, 166 
Representation, associated, 540 
of groups, 531 
irreducible, 531 
orthogonality of, 532 
reducible, 531 
self-associated, 540 
Representative point, 426 
Residuals and precision measures, 496 
Residue, 88 

Residues, theorem of, 88 
Resistance, 42 

Resolvent of an integral equation, 505 
Resonance catastrophe, 55 
Reversion of series, 454 
Rice, 414 
Richtmyer, 475 


Rigid body, definition of, 270 
most general motion of, 271 
rotation of, 271 
translation of, 271 
Ritz method, 361, 363 
Robertson, 171 

Robinson, 450, 453, 478, 480 
Rodrigues' formula, 98 
parameters, 273 
Rojanski, 414 

Root mean square error, 493 
Roots of Legendre polynomial, 463 
matrix, characteristic, 304 
polynomial, numerical determination 
of, 477, 478 

secular determinant, 483 
Rope, suspended, 525 
Rosenthal, 554, 563 
Rotary reflection group, 550 
Rotation, 166 
axis of, 271 
of a rigid body, 271 
vector, 147 
group, 

three-dimensional, 545 
two-dimensional, 550 
improper, 310, 549 
proper, 310 

Rotations as groups, 546 
Rotator 

quantum mechanical treatment, 344 
rigid, 286 
Ruark, 413 
Rule, Simpson's, 460 
trapezoidal, 460 
Weddle’s, 461 
Runge, 450, 481 

Runge-Kutta method for differential 
equations, 469 
Rutledge, 456 

Saddle point, 444 
Sayvetz, 277 
Scalar, 132, 158 
field, 144 
gradient of, 145 
point function, 144, 169 
Scalar product, 136, 297 
Hermitian, 314 
triple, 141 

Scarborough, 450, 453, 454, 478 
Schlaefli's formula, 98, 99, 101 
Schmidt, H., 71 

Schmidt, orthogonalization method for 
functions, 259 
vectors, 298 

Schmidt-Hilbert method for integral 
equations, 510 
Schoenflies, 554 
system of group notation, 555 
Schrödinger, 181, 413 
Schrödinger equation, 171, 208, 325 
and group theory, 560 
involving time, 377, 378 
of free mass point, 334 
Schwank, 504 

Schwarz' inequality, 130, 332 
Second order differential equations, 48 
numerical solution of, 473 
Second order perturbation, 373 
Secular determinant, 405, 483 
Seitz, 81, 414, 556 

Self-adjoint differential equation, 254 
operator, 253 

Self-associated representation, 540 
Separation of center-of-mass coordi- 
nates, 395 

of variables, method of, 171, 213, 
215, 226 

Series integration, 59 
Liouville-Neumann, 504 
method for differential equations, 
466 

reversion of, 454 
Shaw, 18 

Shearing strain, 165 
Sherwood, 24, 466 
Shortley, 406, 414 
Significant figures, 450 
Similarity transformation, 302, 303 
Simpson's rule, 460 
Simultaneous differential equations, 
numerical solution of, 472 
eigenstates, 332 

equations, numerical solution of, 
476, 480 

Singlet states, 407 
Single-valuedness, 320 
Singular point of a differential equa- 
tion, 70 

solution, 33, 47 
Singularity, essential, 71 


Singularity, non-essential, 71 
Skew product, 138 
symmetric matrix, 296 
symmetric tensor, 159 
Slater, 17 
Snapshot, 318 
Soap film, 39 
Sokolnikoff, E. S., 59 
Sokolnikoff, I. S., 59 
Solution, rate of, 35 
singular, 47 

Solution of differential equations, nu- 
merical, 465 

of integral equations, 504, 514 
of simultaneous equations, numeri- 
cal, 476, 480 

of transcendental equations, numeri- 
cal, 474 

Sommerfeld, 115, 337, 382, 413 
Space, Hermitian vector, 313 
linear vector, 296 
Spectroscopy, molecular, 563 
Speiser, 526, 534, 535 
Sphere, moving through incompres- 
sible fluid, 219 
oscillating, 230 
Spherical harmonic, 230, 346 
polar coordinates, 172, 186 
Spheroidal coordinates, oblate, 177 
prolate, 175 

Spin angular momentum, 387 
coordinate, 387 
degeneracy, 393 
displacement operators, 391 
energy, 392 
function, 389 
in two-body problem, 406 
matrices, 389, 546 
operator, 389 
vector, 389 

Spinning electron, 386 
Sponer, 555 

Spread of measurements, 333 
Standard deviation, 333, 421 

Stark effect, 181, 375 
State 

quantum mechanical, 301 
time-dependent, 377 
function, 320 

intuitive meaning of, 327, 331 
Stationarity condition, 195 
Stationary path, 193 
states, 321 

Statistical mechanics, 415 
Steepest descents, method of, 443 
Steiner, 1 

Step function, 234 
Stirling’s formula, 423 
interpolation formula, 453 
theorem, 93 
Stokes’ theorem, 152 
Strain, 156, 164, 165 
Stratton, 115, 250 
Strength, field, 156 
Stress, 156 

String, homogeneous vibrating, 524 
Sturm-Liouville equation, 516, 520 
theory, 253, 266, 326 
Subgroup, 527 
conjugate, 529 
invariant, 529 
Sublimation, 38 
Subtraction of matrices, 292 
Sum of state, 436 
of tensors, 159 
of vectors, 136 
Surface, coordinate, 167 
element, 168, 187 
integral, 151 
tension, 39 

Suspension bridge, 57 
Symbol, Christoffel three-index, 162, 
191 

Symmetric eigenfunctions, 400 
group, 538 
kernel, 510 
matrix, 296 
state function, 439 
tensor, 159 
top, 352 

System, conservative, 269 
thermodynamic, 426 

Taylor series, 98 

Taylor series method for differential 
equations, 466 
Teller, 555 
Temperature, 29 
Tension strain, 165 
Tensor, associated, 162 
component of, 157 
contraction of, 161 


Tensor, contravariant, 158 
covariant, 158 
covariant derivative of, 164 
differentiation of, 162 
length of, 161 
mixed, 158 
first rank, 157 
product, inner, 161 
product, 160 
skew-symmetric, 159 
symmetric, 159 

Tensor notation, differential operators in, 190 

divergence in, 191 
gradient in, 191 
Laplacian in, 192 
curl in, 192 

Tensors in curvilinear coordinates, 187 
difference of, 159 
perpendicular, 161 
sum of, 159 

Thermodynamic derivatives, 15 
potential, Gibbs, 14 
relations, 434 
system, 426 
variables, 1 
laws of, 11 
second law of, 13 
Thermal conductivity, 35 
flux, 146 

Theorem, Gauss’, 154 
Green’s, 156 
of divergence, 154 
of residues, 441 
Stokes’, 152 
Thomas, L. H., 474 
Thomas, T. Y., 166 
Thomson, J. J., 202 
Three-index symbol, Christoffel, 162 
Time-dependent states, 377 
Titchmarsh, 246 
Tolman, 415, 427 
Top, spherical, 355 
symmetrical, 352 
Toroidal coordinates, 185 
Torrance, 59, 151 
Total differentials, 3, 8 
Trace, 534 
of matrix, 294 

Transcendental equations, numerical solution of, 474 
Transformation, collineatory, 302 
congruent, 302, 307 
conjunctive, 302 
linear, 300 
orthogonal, 309 
principal axis, 311 
real orthogonal, 302 
similarity, 302, 303 
unitary, 302, 544 

Transform of a group element, 528 
of a matrix, 302 
Transients, 43 
Transition probability, 386 
forbidden, 386 
Translation, 166 

Translation of a molecule, 277 
of a rigid body, 271 
Transmission coefficient of barrier, 341 
Transparency factor of a potential 
barrier, 342 

Transposed matrix, 294 
Transposition, 538 
Trapezoidal rule, 460 
Triple product, scalar, 141 
vector, 142 
Triplet states, 407 
Tunell, 12 
Tunnel effect, 340 
Turnbull, 301, 307 
Tschebyscheff polynomial, 74, 128 
differential equation, 73 
Two-body problem in quantum me- 
chanics, 397 

Uhlenbeck, electron spin, 387 
Umbral index, 157 

Uncertainty in angular momentum, 334 
Uncertainty principle, 332, 378 
Unimodular unitary group, 543 
Unitary group, 542 
matrix, 296, 315 
transformation, 302, 544 
Unit element in group theory, 526 
matrix, 294 
vectors, 135, 169 
Urey, 413 

Value of a physical quantity, most 
probable, 487 
true, 487 

Van der Waals' equation, 5, 24 


Van der Waerden, 560, 563 
Van Vleck, 376, 414 
Variable, canonically conjugate, 270 
cogredient, 303 
contragredient, 302 
extensive, 1 
independent, 32 
intensive, 1 
thermodynamic, 1 

Variables, Method of separation of, 
171 

Variation, 194 
Variational Method, 361 
Variations, calculus of, 193 
Variation theory of eigenvalue prob- 
lems, 256 
Vector area, 139 
axial, 160 
column, 291 
components of, 132 
contravariant, 157 
covariant, 158 
curl of, 147 

curvilinear component of, 169 

differentiation of, 143 

divergence of, 146 

field, 144 

integration, 149 

irrotational, 149 

length of, 132 

magnitude of, 132 

operator, 169 

origin of, 132 

polar, 160 

product, 137 

radius, 148 

row, 291 

solenoidal, 149 

sum, 136 

space, Hermitian, 313 
linear, 296 
terminus of, 132 
triple product, 142 
unit, 135, 169 
Vectors, base, 188 
difference of, 136 
linear independence of, 297 
orthogonalization of, 298 
products of three, 141 
reciprocal, 188 
scalar product of, 136, 297 














